Investing 101: A visual and predictive guide for the rookie investor
With the Federal Reserve once again cutting interest rates to zero and embarking on massive quantitative easing measures in response to the economic malaise brought about by COVID-19, retail investors are becoming increasingly aware that leaving their hard-earned money in the bank is financially imprudent.
However, entering the market may seem daunting for first-time investors, with so many finance websites offering different resources and opinions. Our application aims to guide beginner investors as they start investing by zeroing in on a select few stocks, commodities and bonds.
Existing financial data websites such as Yahoo Finance do a good job of providing historical price data and technical indicators, but the beginner investor lacks the knowledge to properly utilise and benefit from these. For example, a plethora of technical indicators is available, but it is questionable whether a beginner can truly understand them and use them properly. We have also identified a gap in such websites: they do not provide any form of forecasting or cross-asset analysis to aid investors’ decisions. In addition, it may be informative for an investor to see how global macro events affect assets.
This project aims to use various R packages, mainly tidyverse, TTR and dtwclust, to build an interactive R Shiny application. The R Shiny application (details in section 2) will be an integrated dashboard that allows users to perform different types of analysis while helping them make informed decisions in portfolio construction.
Financial Asset Universe - The list of assets covered in our application is shown below:
| S/N | Stock Market Indices | Commodities | Bonds |
|---|---|---|---|
| 1 | US | Gold | Investment Grade |
| 2 | UK | Silver | High-Yield |
| 3 | Germany | Copper | |
| 4 | France | Iron | |
| 5 | Italy | Natural Gas | |
| 6 | Spain | Crude Oil | |
| 7 | Brazil | | |
| 8 | Singapore | | |
| 9 | Hong Kong | | |
| 10 | Japan | | |
| 11 | China | | |
| 12 | Australia | | |
| 13 | Korea | | |
| 14 | India | | |
We will extract all historical price data from Jan 1 1997 to Dec 31 2020 for our selected assets using the Bloomberg Terminal. The fields we will extract are:
Open Price: The asset’s opening price on a trading day
High Price: The asset’s highest price on a trading day
Low Price: The asset’s lowest price on a trading day
Close Price: The asset’s closing price on a trading day
The main target audience for our application is prospective investors who are looking for a tool to help them build their portfolio and diversify their holdings. This section explores features of our application and some ways it can help our users. The application itself is split into 3 main sections that address different parts of the investing process.
Base visualizations and analysis of key financial asset returns metrics
Exploratory Data Analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. It will be useful for potential investors to have a visual tool to understand various key metrics when analysing financial assets. Some of the key metrics our application aims to help users visualize include:
Possible insights: Diversification through identifying weakly correlated markets and indices. This allows the user to ‘put their eggs across different baskets’ and spread out their risk.
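As a rough illustration of the diversification insight above, the sketch below computes simple returns from closing prices and the Pearson correlation between two assets. It is written in Python using only the standard library, whereas the application itself is planned in R. The prices and the names `simple_returns`, `pearson` and `closes` are hypothetical and serve only as an example.

```python
import math

def simple_returns(prices):
    """Period-over-period simple returns from a list of closing prices."""
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical closing prices, for illustration only (not real market data).
closes = {
    "equity_index": [100.0, 102.0, 101.0, 105.0, 107.0, 106.0],
    "gold": [50.0, 49.5, 50.5, 49.0, 48.5, 49.5],
}

returns = {name: simple_returns(prices) for name, prices in closes.items()}
corr_equity_gold = pearson(returns["equity_index"], returns["gold"])
print(f"Correlation of returns: {corr_equity_gold:.2f}")
```

A correlation close to zero or below zero suggests that the two assets tend not to move together, which is the property a diversifying investor is looking for.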
Predicting financial asset prices using an ARIMA model
ARIMA is an acronym that stands for AutoRegressive Integrated Moving Average. It is a class of model that captures a suite of different standard temporal structures in time series data.
AR: Autoregression. A regression model that uses the dependency between an observation and a number of lagged observations (p).
I: Integrated. Differencing the observations, that is, subtracting the previous observation from each observation, in order to make the time series stationary (d).
MA: Moving Average. A model that uses the dependency between an observation and the residual errors from a moving average model applied to lagged observations (q).
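To make the AR and I components more concrete, the sketch below differences a price series and fits a first-order autoregression to the differences by ordinary least squares. This is a deliberately simplified stand-in, roughly an ARIMA model with one AR lag, one round of differencing and no MA term. It is not how a full ARIMA model is usually estimated, and it is not the application's implementation. The function names and the price series are hypothetical.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the 'Integrated' step)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out

def fit_ar1(series):
    """Least-squares estimate of c and phi in y_t = c + phi * y_{t-1} + e_t."""
    x = series[:-1]  # lagged observations
    y = series[1:]   # current observations
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi

def forecast_next(series, c, phi):
    """One-step-ahead forecast from the fitted AR(1) model."""
    return c + phi * series[-1]

# Hypothetical closing prices, for illustration only.
prices = [100.0, 101.0, 103.0, 102.0, 104.0, 107.0, 106.0, 109.0]

changes = difference(prices, d=1)            # day-over-day price changes
c, phi = fit_ar1(changes)                    # AR(1) on the differenced series
next_change = forecast_next(changes, c, phi)
next_price = prices[-1] + next_change        # undo the differencing
print(f"Forecast for the next close: {next_price:.2f}")
```

The forecast is made on the differenced series and then added back to the last observed price, which mirrors how the differencing step is reversed when an ARIMA forecast is reported in price terms.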
Along with various parameters to be tuned, an ARIMA model uses purely price action data for forecasting future prices.
The general form of an ARIMA model is denoted ARIMA(p, d, q), where p is the number of autoregressive lags, d is the number of times the series is differenced and q is the order of the moving average. With seasonal time series data, it is likely that short-run non-seasonal components also contribute to the model.
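For reference, one common way to write a non-seasonal ARIMA model uses the conventional ordering ARIMA(p, d, q), with p autoregressive lags, d rounds of differencing and q moving-average terms. The symbols are not taken from the text above and are assumptions of this sketch: $y_t$ is the price at time $t$, $B$ is the backshift operator with $B y_t = y_{t-1}$, $\phi_i$ and $\theta_j$ are the autoregressive and moving-average coefficients, $c$ is a constant and $\varepsilon_t$ is the error term.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
```

With $d = 0$ this reduces to an ARMA(p, q) model on the original series. With $p = q = 0$ and $d = 1$ it is a random walk, with drift when $c \neq 0$.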
Possible insights: Forecast the likely price movements in the period ahead in order to help identify entry points.
Clustering financial assets based on historical returns
Time series clustering partitions time series data into groups based on similarity or distance, so that time series in the same cluster are similar. Dynamic Time Warping (DTW) is used as the similarity/distance metric; it finds the optimal alignment between two time series.
The first step is to work out an appropriate distance/similarity metric (i.e. DTW).
The second step is to apply an existing clustering technique, such as k-means, hierarchical clustering, density-based clustering or subspace clustering, to find clustering structures.
The final result will then be presented on the visualization.
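The first two steps above can be sketched as follows. This is a minimal Python illustration using only the standard library, not the application's R implementation. It assumes unconstrained DTW with an absolute-difference local cost, and it uses agglomerative (hierarchical) clustering with average linkage as the clustering technique. The series and the function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance with absolute-difference cost and no window."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def hierarchical_clusters(series, k):
    """Average-linkage agglomerative clustering on pairwise DTW distances."""
    names = list(series)
    dist = {(u, v): dtw_distance(series[u], series[v])
            for u in names for v in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        return sum(dist[(u, v)] for u in c1 for v in c2) / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the pair of clusters with the smallest average distance.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical series, for illustration only: asset_b is a time-shifted
# copy of asset_a, and asset_d is a time-shifted copy of asset_c.
series = {
    "asset_a": [0, 1, 2, 1, 0, -1, 0],
    "asset_b": [0, 0, 1, 2, 1, 0, -1],
    "asset_c": [3, 3, 3, 2, 3, 3, 3],
    "asset_d": [3, 3, 2, 3, 3, 3, 3],
}

clusters = hierarchical_clusters(series, k=2)
print(clusters)
```

Because DTW lets one series stretch or compress in time to match another, the shifted pairs end up close together even though their values do not line up point by point.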
Possible insights: Time series in the same cluster are likely to undergo similar cycles and risks. This means that users can expect similar movements across cycles/periods and act accordingly.