Project Proposal

Investing 101: A visual and predictive guide for the rookie investor

Andre Lee | Boey Yi Heen | Ng Weekien
02-28-2021

1.0 Overview

With the Federal Reserve once again cutting interest rates to zero and embarking on massive quantitative easing measures in response to the economic malaise brought about by COVID-19, retail investors are becoming increasingly aware that leaving their hard-earned money in the bank is financially imprudent.

However, entering the market can seem daunting for first-time investors, with so many finance websites offering different resources and opinions. Our application aims to guide beginner investors as they start investing by zeroing in on a select few stock market indices, commodities and bonds.

2.0 Motivation and Objectives

Existing financial data websites such as Yahoo Finance do a good job of providing historical price data and technical indicators, but beginner investors often lack the knowledge to use and benefit from them. For example, a plethora of technical indicators is available, but it is questionable whether a beginner can truly understand and apply them properly. We have also identified a gap: such websites do not provide any form of forecasting or cross-asset analysis to aid investors’ decisions. In addition, it may be informative for an investor to see how global macro events affect assets.

This project aims to utilise various R packages, mainly tidyverse, TTR and dtwclust, to build an interactive R Shiny application. The R Shiny application (details in section 5) will be an integrated dashboard that allows users to perform different types of analysis while helping them make informed decisions in portfolio construction.

3.0 Scope

Financial Asset Universe - The list of assets covered in our application is shown below:

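| S/N | Stock Market Indices | Commodities | Bonds |
| --- | --- | --- | --- |
| 1 | US | Gold | Investment Grade |
| 2 | UK | Silver | High-Yield |
| 3 | Germany | Copper | |
| 4 | France | Iron | |
| 5 | Italy | Natural Gas | |
| 6 | Spain | Crude Oil | |
| 7 | Brazil | | |
| 8 | Singapore | | |
| 9 | Hong Kong | | |
| 10 | Japan | | |
| 11 | China | | |
| 12 | Australia | | |
| 13 | Korea | | |
| 14 | India | | |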

4.0 Data

We will extract all historical price data from Jan 1 1997 to Dec 31 2020 for our selected assets using the Bloomberg Terminal. The fields we will extract are:

Open Price: The asset’s opening price on a trading day

High Price: The asset’s highest price on a trading day

Low Price: The asset’s lowest price on a trading day

Close Price: The asset’s closing price on a trading day

5.0 Application Features

The main target audience for our application is prospective investors who are looking for a tool to help them build their portfolio and diversify their holdings. This section explores the features of our application and some ways it can help our users. The application is split into 3 main sections that address different parts of the investing process.

5.1 Exploratory Data Analysis (EDA)

Base visualizations and analysis of key financial asset returns metrics

Exploratory Data Analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. It will be useful for potential investors to have a visual tool to understand various key metrics when analysing financial assets. Some of the key metrics our application aims to help users visualize include:

  1. Correlation
    • Correlation matrix to show correlation between selected assets at different time horizons
  2. Distribution of returns
    • A histogram of returns for each asset will allow users to visualize volatility
  3. Risk to Reward
    • Looking purely at gross returns often does not paint the full picture, as most high-return assets come with greater volatility. Metrics such as the Sharpe ratio will give users a better sense of the risk-to-reward profile of different assets.
  4. Technical indicators
    • Technical analysis is a trading discipline employed to evaluate investments and identify trading opportunities by analyzing statistical trends gathered from trading activity, such as price movement and volume.
    • Our application will group indicators by type (trend, momentum or volatility) so that users know precisely what each indicator measures. This is in contrast to many existing websites, where users have a plethora of indicators at their disposal but may not know what they are actually evaluating.
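The grouping of indicators by type can be illustrated with a minimal sketch. It is written in Python for illustration only (the application itself is planned in R, where a package such as TTR would normally supply the indicators). The function names, the `INDICATORS` registry and the sample closing prices are all hypothetical and not part of the proposal.

```python
from statistics import mean, stdev


def sma(prices, window):
    """Trend: simple moving average of the last `window` closing prices."""
    return [mean(prices[i - window + 1:i + 1])
            for i in range(window - 1, len(prices))]


def rate_of_change(prices, window):
    """Momentum: percentage change versus the close `window` days earlier."""
    return [(prices[i] / prices[i - window] - 1.0) * 100.0
            for i in range(window, len(prices))]


def rolling_volatility(prices, window):
    """Volatility: sample standard deviation of the last `window` daily returns."""
    returns = [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]
    return [stdev(returns[i - window + 1:i + 1])
            for i in range(window - 1, len(returns))]


# Hypothetical registry: each indicator is filed under the question it answers.
INDICATORS = {
    "trend": {"Simple moving average": sma},
    "momentum": {"Rate of change": rate_of_change},
    "volatility": {"Rolling standard deviation": rolling_volatility},
}

closes = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]  # made-up closing prices

for category, functions in INDICATORS.items():
    for name, function in functions.items():
        values = [round(v, 4) for v in function(closes, 3)]
        print(f"{category:<10} {name:<27} {values}")
```

Keeping the category as the outer key means the user interface can present "trend", "momentum" and "volatility" first and only then list the indicators under each heading.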

Possible insights: Diversification through identifying weakly correlated markets and indices. This allows the user to ‘put their eggs across different baskets’ and spread out their risk.
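As a rough illustration of the return-based metrics in this section, the sketch below computes simple returns, the Pearson correlation between two assets and an annualised Sharpe ratio. It is written in Python for illustration only (the application itself is planned in R). The prices are made up, and the zero risk-free rate and the 252 trading days per year are assumptions rather than choices stated in the proposal.

```python
from math import sqrt
from statistics import mean, stdev


def simple_returns(prices):
    """Period-over-period simple returns from a list of closing prices."""
    return [prices[i] / prices[i - 1] - 1.0 for i in range(1, len(prices))]


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length return series."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs)
    spread_y = sum((y - my) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualised Sharpe ratio: mean excess return divided by its volatility.

    Assumes daily returns and an annual risk-free rate (both assumptions).
    """
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)


# Made-up closing prices for two hypothetical assets.
asset_a = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0]
asset_b = [50.0, 49.5, 49.0, 49.8, 49.2, 48.9]

returns_a = simple_returns(asset_a)
returns_b = simple_returns(asset_b)

print("correlation A vs B:", round(correlation(returns_a, returns_b), 3))
print("Sharpe ratio of A: ", round(sharpe_ratio(returns_a), 3))
```

A correlation close to zero or negative between two assets is the signal a user would look for when diversifying, while the Sharpe ratio puts assets with different volatility on a comparable footing.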

5.2 Time-Series Forecasting

Predicting financial asset prices using an ARIMA model

ARIMA is an acronym that stands for AutoRegressive Integrated Moving Average. It is a class of models that captures a suite of different standard temporal structures in time series data.

Apart from its tunable parameters, an ARIMA model relies purely on historical price data to forecast future prices.

The general form of an ARIMA model is denoted ARIMA(p, d, q), where p is the order (number of lags) of the autoregressive part, d is the degree of differencing needed to make the series stationary, and q is the order of the moving-average part. With seasonal time series data, it is likely that short-run non-seasonal components also contribute to the model.

Possible insights: Forecast the likely price movements in the period ahead to help identify entry points.
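To make the roles of differencing and the autoregressive term concrete, here is a deliberately simplified sketch of one special case: a single difference followed by a one-lag autoregression fitted by ordinary least squares, with no moving-average term. It is written in Python for illustration only (the application itself is planned in R, where the model order and the estimation method would be chosen by the fitting routine). The function names and prices are hypothetical, and least squares stands in for the likelihood-based estimation that statistical packages typically use.

```python
def difference(series):
    """First difference: the 'integrated' step with one round of differencing."""
    return [series[i] - series[i - 1] for i in range(1, len(series))]


def fit_ar1(values):
    """Least-squares fit of values[t] = c + phi * values[t - 1] + error.

    Returns (c, phi). Assumes the lagged values are not all identical.
    """
    lagged, current = values[:-1], values[1:]
    mean_lagged = sum(lagged) / len(lagged)
    mean_current = sum(current) / len(current)
    spread = sum((x - mean_lagged) ** 2 for x in lagged)
    phi = sum((x - mean_lagged) * (y - mean_current)
              for x, y in zip(lagged, current)) / spread
    return mean_current - phi * mean_lagged, phi


def forecast(prices, steps):
    """Forecast `steps` prices ahead with one difference and one AR lag."""
    diffs = difference(prices)
    c, phi = fit_ar1(diffs)
    last_price, last_diff = prices[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # predict the next price change
        last_price += last_diff           # undo the differencing
        forecasts.append(last_price)
    return forecasts


closes = [100.0, 101.5, 101.0, 103.0, 104.5, 104.0, 106.0]  # made-up prices
print([round(p, 2) for p in forecast(closes, 3)])
```

The forecast is built on price changes rather than price levels, which is why each predicted change has to be added back onto the last known price.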

5.3 Time-Series Clustering

Clustering financial assets based on historical returns

Time series clustering partitions time series data into groups based on similarity or distance, so that time series in the same cluster are similar. Dynamic Time Warping (DTW), which finds the optimal alignment between two time series, is used as the similarity/distance metric.

  1. The first step is to work out an appropriate distance/similarity metric (i.e. DTW).

  2. The second step is to use existing clustering techniques, such as k-means, hierarchical clustering, density-based clustering or subspace clustering, to find clustering structures.

  3. The final result will then be presented in a visualization.

Possible insights: Time series in the same cluster are likely to undergo similar cycles and risks. This means that users can expect similar movements across cycles/periods and act accordingly.
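The first two steps above can be sketched as follows: a textbook dynamic-programming DTW distance, followed by a small agglomerative (hierarchical) clustering that uses it. It is written in Python for illustration only (the application itself is planned in R, where the dtwclust package would normally be used). The single-linkage rule, the function names and the three short series are assumptions made for this example, not choices stated in the proposal.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # a advances alone
                                    cost[i][j - 1],      # b advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def hierarchical_clusters(series, k):
    """Merge the two closest clusters (single linkage) until k remain."""
    clusters = [[name] for name in series]

    def linkage(first, second):
        return min(dtw_distance(series[x], series[y])
                   for x in first for y in second)

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters


# Made-up series: B is A delayed by one period, C moves the opposite way.
series = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [1.0, 1.0, 2.0, 3.0, 4.0],
    "C": [10.0, 9.0, 8.0, 7.0],
}

print(hierarchical_clusters(series, 2))
```

Because DTW allows one series to be stretched in time to match another, the delayed series B is treated as identical in shape to A even though the two have different lengths, which a point-by-point distance could not handle.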

6.0 Tools and Packages

| S/N | Package | Documentation |
| --- | --- | --- |
| 1 | tidyverse | https://www.tidyverse.org/ |
| 2 | dplyr | https://www.rdocumentation.org/packages/dplyr/versions/0.7.8 |
| 3 | TTR | https://cran.r-project.org/web/packages/TTR/index.html |
| 4 | plotly | https://www.rdocumentation.org/packages/plotly/versions/4.9.3/topics/plot_ly |
| 5 | ggplot2 | https://www.rdocumentation.org/packages/ggplot2/versions/3.3.3 |
| 6 | Shiny | https://www.rdocumentation.org/packages/shiny/versions/1.6.0 |
| 7 | dtwclust | https://www.rdocumentation.org/packages/dtwclust/versions/3.1.1 |
| 8 | dygraphs | https://www.rdocumentation.org/packages/dygraphs/versions/1.1.1.6 |
| 9 | tidyquant | https://cran.r-project.org/web/packages/tidyquant/index.html |
| 10 | readr | https://www.rdocumentation.org/packages/readr/versions/1.3.1 |
| 11 | rticles | https://www.rdocumentation.org/packages/rticles/versions/0.18 |

7.0 References

  1. A benchmark study on time series clustering

  2. A Scalable Method for Time Series Clustering

  3. Technical Analysis with R

  4. Examination of the profitability of technical analysis based on moving average strategies in BRICS

  5. Stock Price Prediction Using the ARIMA Model