Category: Finance, Business, Management, Economics and Accounting

ORIGINAL

Utilizing Machine Learning and Deep Learning for Predicting Crypto-currency Trends

Ahmed El Youssefi¹*, Abdelaaziz Hessane¹*, Imad Zeroual¹*, Yousef Farhaoui¹*

¹STI laboratory, T-IDMS, Faculty of Sciences and Techniques of Errachidia, Moulay Ismail University of Meknès. Morocco.

Cite as: Youssefi AE, Hessane A, Zeroual I, Farhaoui Y. Utilizing Machine Learning and Deep Learning for Predicting Crypto-currency Trends. Salud, Ciencia y Tecnología - Serie de Conferencias 2024; 3:638. https://doi.org/10.56294/sctconf2024638.

Submitted: 20-12-2023 Revised: 13-02-2024 Accepted: 10-03-2024 Published: 11-03-2024

Editor: Dr. William Castillo-González

ABSTRACT

In the dynamic and often volatile world of the cryptocurrency market, accurately predicting future market movements is crucial for making informed trading decisions. While manual trading involves traders making subjective judgments based on market observations, the development of algorithmic trading systems, incorporating Machine Learning and Deep Learning, has introduced a more systematic approach to trading. These systems often employ technical analysis and machine learning techniques to analyze historical price data and generate trading signals. This study delves into a comparative analysis of two charting techniques, Heikin-Ashi and alternate candlestick patterns, in the context of forecasting single-step future price movements of cryptocurrency pairs. Utilizing a range of time windows (1 day, 12 hours, 8 hours, ..., 5 minutes) and various regression algorithms (Huber regressor, k-nearest neighbors regressor, Light Gradient Boosting Machine, linear regression, and random forest regressor), the study evaluates the effectiveness of each technique in forecasting future price movements. The primary outcomes of the research indicate that the application of ensemble learning methods to the alternate candlestick patterns consistently surpasses the performance of Heikin-Ashi candlesticks across all examined time windows. This suggests that alternate candlestick patterns provide more reliable information for predicting short-term price movements. Additionally, the study highlights the varying behavior of Heikin-Ashi candlesticks over different time windows.

Keywords: Technical Analysis; Machine Learning; Deep Learning; Charting Techniques; Cryptocurrency Price Forecasting; Heikin-Ashi Candlesticks.

INTRODUCTION

The high volatility of the cryptocurrency market⁽¹⁾ promises large returns on investment but also exposes traders to heavy losses when their trading decisions are wrong. When trading an asset, a trader decides whether to buy, sell, or hold (hodl, in the cryptocurrency and blockchain community).^(2,3) These decisions depend on the trader's estimate of how the market will behave in the future. To trade cryptocurrencies, traders use centralized and decentralized exchanges. Centralized exchanges (CEXs) are the predominant platforms for trading tokens and cryptocurrencies across various tradable pairs. CEXs rely on infrastructure reminiscent of traditional equities markets, with analogous protocols and rules for trade execution. Together, these features facilitate liquidity provision and price discovery on these platforms.⁽⁴⁾ Decentralized exchanges (DEXs) are an alternative market structure for traders of crypto assets, relying on smart-contract implementations of automated market makers (AMMs). This framework enables on-chain trading, offering a distinct approach to crypto asset transactions.⁽⁵⁾ CEXs offer different markets to trade on; this study is limited to the spot market. In the spot market, traders exchange assets (tradable pairs) instantly. Transactions in the spot market (referred to as spot trading) are settled immediately, with sellers specifying an ask price and buyers indicating a bid price. This market is characterized by real-time exchanges and the interaction of buyers and sellers through order books managed by the CEX.

Forecasting the trend of cryptocurrency prices is a challenging task. Traders use a variety of techniques, including technical analysis, fundamental analysis, and machine learning.

Technical analysis, which is commonly employed in forecasting cryptocurrency market trends, is founded on the premise that historical price movements and patterns can be used to anticipate future trends. It provides objective, data-driven insights into market trends. By using technical indicators,⁽⁶⁾ trading rules such as the trading range break-out, which is based on support and resistance levels,⁽⁷⁾ and chart patterns,⁽⁸⁾ traders can decide whether to sell, hodl, or buy a given cryptocurrency. In technical analysis, technical indicators are calculated from the historical price and other transaction data of a cryptocurrency, or from a list of values aggregated over a given time window, such as Japanese candlesticks (referred to as OHLC values, for Open, High, Low and Close) or Heikin-Ashi candlesticks.

Fundamental analysis is an approach to valuing assets by examining their intrinsic worth.⁽⁹⁾ When it comes to forecasting the value of cryptocurrencies, fundamental analysis entails evaluating the underlying elements that power a given coin or cryptocurrency, such as the blockchain technology it relies on, forthcoming project events (such as partnerships, a halving event, or the introduction of a new consensus algorithm), the rate at which the project's services are adopted, the updates the project team delivers, and other project-related details. Fundamental analysts can make well-informed predictions regarding the future price of the coin by examining these variables. However, fundamental analysis can be challenging to apply to cryptocurrencies, because the cryptocurrency market is highly volatile⁽¹⁰⁾ and can be influenced by factors that are hard to forecast, such as sentiment and speculation.⁽¹¹⁾

Machine learning algorithms have emerged as a promising tool for building cryptocurrency forecasting models.⁽⁸⁾ These algorithms can analyze vast volumes of data to find trends and relationships that could indicate future price changes. They use historical price data aggregated over a period to train models that forecast one or multiple future steps of a cryptocurrency's price. The datasets used to train these models are mainly OHLC data collected directly from exchanges' historical data archives or from sources that aggregate them across different exchanges. Forecasting the future price movements of a cryptocurrency can be treated as a classification or a regression problem. Classification of future prices uses a labeled dataset generated from the OHLC data combined with different types of features, such as in.⁽¹²⁾ Regression analysis uses regression algorithms to forecast the values of cryptocurrency prices over a specific horizon.

Both Heikin-Ashi and Japanese candlesticks are charting techniques used to visualize the price changes of an asset, but they can also be considered aggregation techniques, since they aggregate price data over a given time window. The resulting data can be used to forecast the future market trends of a cryptocurrency. But which one is better, Japanese candlesticks or Heikin-Ashi, and over which time window? We therefore compare the two techniques as a regression problem, using different machine learning algorithms over different time windows. To the best of our knowledge, no other study has made such a comparison of the two techniques, hence the novelty and the contribution of this work.

The rest of this paper is structured as follows. The second section, on related works, presents recent studies that use Japanese and Heikin-Ashi candlesticks with different time windows. The third section details the data collection and preprocessing process, along with the main concepts and formulas related to it. The fourth section presents the results and discusses the main findings. The final section concludes the paper and presents perspectives for future work related to the theme of this study.

Related works

Several studies have investigated the use of regression algorithms, Japanese candlesticks and Heikin-Ashi for cryptocurrency and stock price forecasting. Shakri et al.⁽¹³⁾ investigated the effectiveness of various data-driven machine learning (ML) techniques for forecasting bitcoin returns time series data. The data used for forecasting included a comprehensive set of economic and financial indicators as predictors. To evaluate the performance of each ML technique, five statistical indexes were calculated: correlation coefficient, mean absolute error, root mean square error, relative absolute error, and root relative squared error. The results revealed that the Random Forest model outperformed the other ML techniques in terms of predictive accuracy. Mahayana et al.⁽¹⁴⁾ propose a machine learning-based system for cryptocurrency trading, employing the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology and the Light Gradient Boosting Machine (LGBM) algorithm to forecast the price movement of the BTCUSDT cryptocurrency pair. The proposed system uses technical indicators and feature engineering to enhance the predictive accuracy of the LGBM model. While the model outperforms Logistic Regression, it does not consistently exceed the ROI of the Buy and Hold strategy, indicating how difficult it is to generate consistent profit in cryptocurrency trading through machine learning approaches. Lahmiri et al.⁽¹⁵⁾ present a comparative evaluation of various AI systems for forecasting high-frequency bitcoin price series. They employed three distinct sets of models: statistical ML approaches, algorithmic models, and artificial neural networks. The authors used Bayesian optimization to determine the optimal parameter values for support vector regressions (SVR), Gaussian Poisson regressions, and kNN. The results of this study demonstrate that Bayesian regularization artificial neural networks achieve better forecasting accuracy and convergence than the other algorithms used.

Heikin-Ashi candlesticks are used in the literature for different purposes. Madbouly et al.⁽¹⁶⁾ introduce a method for forecasting stock prices by integrating cloud models, fuzzy time series, and Heikin-Ashi candlesticks. The devised model tackles the complexities of nonlinearity, uncertainty, and noise inherent in stock market trends. Leveraging cloud models, it manages the ambiguity and uncertainty associated with both qualitative (Japanese candlestick patterns) and quantitative (actual stock prices) data. Fuzzy time series capture the dynamic nature of stock prices, incorporating fuzzy logic to represent linguistic concepts. Additionally, Heikin-Ashi candlesticks were employed to filter out stock noise, emphasizing the directional trends in the market. Piasecki et al.⁽¹⁷⁾ consider Heikin-Ashi candlesticks as a variation of Japanese candlesticks that provides a smoother representation of price trends by incorporating elements of both past and present price movements. This transformation can be mathematically represented using oriented fuzzy numbers, which capture the uncertainty inherent in price data. While Heikin-Ashi candlesticks may introduce more imprecision than traditional Japanese candlesticks due to their averaging methodology, they can be effective for identifying trends in noisy price data. The model is evaluated empirically and found to have high forecasting accuracy, making it feasible for practical implementation. The use of Heikin-Ashi candlesticks is not limited to regression problems.

El Youssefi et al.⁽¹⁸⁾ employ the K-Means clustering algorithm to categorize Heikin-Ashi candlesticks and logarithmic returns. The study explores the determination of the optimal number of classes for logarithmic returns associated with four specific cryptocurrencies, extracted from historical spot trading data archived by Binance. The outcomes reveal that the most suitable k-values for the analyzed cryptocurrencies fall within the range of three to five. These findings underscore the significance of clustering as a preprocessing step for classification when addressing the forecasting of cryptocurrency logarithmic returns. This approach proves more beneficial than utilizing a predefined set of classes that represent uptrends, downtrends, and no change in logarithmic returns.

Different time windows for cryptocurrency forecasting are used across the reported studies. To investigate the effectiveness of various machine learning algorithms in forecasting price movements for bitcoin, Akyildirim et al.⁽¹⁹⁾ evaluated the relative forecasting performance of kNN, naïve Bayes, logistic regression, random forest, support vector machine and extreme gradient boosting classifiers across a range of time windows spanning from 5 to 60 minutes. The findings reveal that the kNN and Random Forest algorithms consistently outperformed the other methods in forecasting the value of the target variable across these time windows. Cohen et al.⁽²⁰⁾ employed commonly used oscillators, namely RSI, MACD, and Keltner Channels, to develop algorithmic trading systems for five popular cryptocurrencies: Bitcoin, Ethereum, Binance Coin, Cardano, and XRP. Intraday price data with time frames ranging from 5 to 180 minutes is used to evaluate the performance of each trading system. The results indicate that longer time frames (60 and 120 minutes) yield superior trading results compared to shorter time frames (5 and 15 minutes).

METHODS

Data Collection and preprocessing

Binance⁽²¹⁾ offers, for download, archived historical spot market data for all the trading pairs listed on its exchange. To forecast the bitcoin price, we downloaded the historical data of the bitcoin/USDT trading pair, aggregated over a 1-minute time window, for the period from 2017-08-01 to 2023-06-30. The data were then aggregated to the following time windows: 5 minutes, 10 minutes, 15 minutes, 30 minutes, 1 hour, 2 hours, 4 hours, 8 hours, 12 hours and 1 day. These are the time windows used in our study.
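The re-aggregation of 1-minute data into coarser windows can be sketched in plain Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Candle` tuple, the `aggregate` helper and the sample values are hypothetical, and dropping a trailing incomplete group is a choice made for the sketch.

```python
from typing import List, NamedTuple


class Candle(NamedTuple):
    open: float
    high: float
    low: float
    close: float
    volume: float


def aggregate(candles: List[Candle], factor: int) -> List[Candle]:
    """Merge consecutive groups of `factor` candles into coarser candles.

    A trailing incomplete group is dropped (an assumption of this sketch).
    """
    merged = []
    for start in range(0, len(candles) - factor + 1, factor):
        group = candles[start:start + factor]
        merged.append(Candle(
            open=group[0].open,                # first open of the group
            high=max(c.high for c in group),   # highest high
            low=min(c.low for c in group),     # lowest low
            close=group[-1].close,             # last close of the group
            volume=sum(c.volume for c in group),
        ))
    return merged


# Ten hypothetical 1-minute candles merged into two 5-minute candles.
one_minute = [Candle(100 + i, 101 + i, 99 + i, 100.5 + i, 1.0) for i in range(10)]
five_minute = aggregate(one_minute, 5)
```

Calling `aggregate` with a factor of 60 on the same kind of input would give 1-hour candles, and so on for the other windows.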

To understand how this data is generated at the exchange level and how it is aggregated, we will present five main concepts: Exchange tick interval, aggregation time window, Japanese candlesticks, Heikin-Ashi candlesticks and logarithmic returns of price.

Exchange tick interval

The tick interval is the time difference between two consecutive market updates streamed by an exchange. For instance, Binance, one of the largest exchanges globally, operates with a tick interval of one second. This implies that, every second, Binance sends an update of the current price of a cryptocurrency to reflect its value on the exchange. It is important to note that the tick interval is different from the tick size of a cryptocurrency; the latter is the minimum amount by which the price of an asset can move up or down.

Aggregation time window

An aggregation time window $W_{t,t'}$, of size $t - t'$, is a specific time interval, expressed in seconds, minutes, hours, days or more, over which prices are updated. Each aggregation time window $W_{t,t'}$ has $n$ data points $t_i$, where $t_i$ is the $i$-th tick. To each $t_i$ corresponds $P_i$, the price at the $i$-th tick. The value of $n$ depends on the tick interval of the exchange. For example, if a cryptocurrency price is monitored on the Binance exchange, each 1-minute aggregation time window has $n = 60$, which means that 60 price updates take place within that minute.

The prices of a cryptocurrency over an aggregation time window $W_{t,t'}$ can be defined by the following formula:

$$P_{W_{t,t'}} = \{P_1, P_2, \dots, P_n\}$$

Japanese Candlesticks

Japanese candlesticks are a charting method used to visualize the price fluctuations of an asset throughout its historical movement. Each candlestick consists of a rectangular body and two wick-like extensions, known as shadows. The body of the candlestick represents the difference between the closing price $C_{W_{t,t'}}$ (price at the last tick) and the opening price $O_{W_{t,t'}}$ (price at the first tick) for the considered aggregation time window $W_{t,t'}$. The shadows represent the highest price $H_{W_{t,t'}}$ and the lowest price $L_{W_{t,t'}}$ reached during the aggregation time window $W_{t,t'}$. The formulas to calculate the four values of each candlestick, within an aggregation time window $W_{t,t'}$ that has $n$ ticks and a list of prices $P_{W_{t,t'}}$, are as follows:

$$O_{W_{t,t'}} = P_1, \qquad H_{W_{t,t'}} = \max_{1 \le i \le n} P_i, \qquad L_{W_{t,t'}} = \min_{1 \le i \le n} P_i, \qquad C_{W_{t,t'}} = P_n$$
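The four candlestick values follow directly from the description above: first tick, maximum, minimum and last tick of the window. A minimal sketch, in which the `ohlc` helper and the sample tick prices are hypothetical names and values chosen for illustration:

```python
def ohlc(prices):
    """Return (open, high, low, close) for the tick prices of one window."""
    if not prices:
        raise ValueError("a window needs at least one tick price")
    return prices[0], max(prices), min(prices), prices[-1]


# Hypothetical tick prices P_1..P_n observed inside one window.
window_prices = [27000.0, 27010.5, 26990.0, 27005.0]
o, h, l, c = ohlc(window_prices)
is_green = c > o  # close above open -> green (hollow) candle
```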

Japanese candlesticks are color-coded to indicate whether the price closed higher or lower than the open price. A green candlestick (which can also be represented as a hollow or white candlestick) indicates that the close price exceeded the open price. A red candlestick (which can also be filled or black) indicates that the close price was lower than the open price. In practice, the red and green color codes are used more frequently than the other two sets of color codes. In the literature, Japanese candlesticks are also referred to as OHLC data.

Heikin-Ashi Candlesticks

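Heikin-Ashi candlesticks are a type of candlestick derived from Japanese candlesticks. They are designed to make it easier to visually identify trends in the market.⁽¹⁷⁾ For a given aggregation time window $W_{t,t'}$ that has $n$ ticks and a list of prices $P_{W_{t,t'}}$, the Heikin-Ashi candlesticks are calculated using the following formulas:

$$
\begin{aligned}
C^{HA}_{W_{t,t'}} &= \tfrac{1}{4}\left(O_{W_{t,t'}} + H_{W_{t,t'}} + L_{W_{t,t'}} + C_{W_{t,t'}}\right) \\
O^{HA}_{W_{t,t'}} &= \tfrac{1}{2}\left(O^{HA}_{W_{\text{prev}}} + C^{HA}_{W_{\text{prev}}}\right) \\
H^{HA}_{W_{t,t'}} &= \max\left(H_{W_{t,t'}},\, O^{HA}_{W_{t,t'}},\, C^{HA}_{W_{t,t'}}\right) \\
L^{HA}_{W_{t,t'}} &= \min\left(L_{W_{t,t'}},\, O^{HA}_{W_{t,t'}},\, C^{HA}_{W_{t,t'}}\right)
\end{aligned}
$$

where the superscript $HA$ marks Heikin-Ashi values and $W_{\text{prev}}$ denotes the aggregation time window immediately preceding $W_{t,t'}$.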

Heikin-Ashi candlesticks are smoother than Japanese candlesticks because they use an average of the previous period's open and close prices to calculate the open price.
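The smoothing can be illustrated with a short sketch based on the standard Heikin-Ashi definitions. The `heikin_ashi` helper and the sample candles are hypothetical, and seeding the first Heikin-Ashi open with the average of the first candle's open and close is a common convention assumed here, not something the study specifies.

```python
def heikin_ashi(candles):
    """Convert (open, high, low, close) tuples into Heikin-Ashi tuples."""
    result = []
    for o, h, l, c in candles:
        ha_close = (o + h + l + c) / 4
        if result:
            prev_open, _, _, prev_close = result[-1]
            ha_open = (prev_open + prev_close) / 2
        else:
            # Common seeding convention for the first candle (assumption).
            ha_open = (o + c) / 2
        ha_high = max(h, ha_open, ha_close)
        ha_low = min(l, ha_open, ha_close)
        result.append((ha_open, ha_high, ha_low, ha_close))
    return result


japanese = [(10.0, 12.0, 9.0, 11.0), (11.0, 14.0, 10.0, 13.0)]
ha = heikin_ashi(japanese)
```

Because each Heikin-Ashi open depends on the previous Heikin-Ashi candle, the series has to be computed in chronological order.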

Logarithmic returns

While it might seem straightforward to use simple returns in price forecasting, logarithmic returns of the target feature are used instead (the close price is generally used as the target feature). Logarithmic returns are preferred for their characteristics, mainly time additivity and symmetry. They are widely used in cryptocurrency price forecasting tasks.^(22,23)

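The formula to calculate the logarithmic return using the close price of a cryptocurrency over an aggregation time window $W_{t,t'}$ is as follows:

$$r_{W_{t,t'}} = \ln\left(\frac{C_{W_{t,t'}}}{C_{W_{\text{prev}}}}\right)$$

where $C_{W_{\text{prev}}}$ is the close price of the aggregation time window immediately preceding $W_{t,t'}$.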
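A small sketch of the computation, which also checks the time-additivity property mentioned above. The `log_returns` helper and the close prices are hypothetical values chosen for illustration.

```python
import math


def log_returns(closes):
    """Logarithmic return between each pair of consecutive close prices."""
    return [math.log(cur / prev) for prev, cur in zip(closes, closes[1:])]


closes = [100.0, 110.0, 99.0]
returns = log_returns(closes)

# Time additivity: single-step log returns sum to the whole-period log return.
total = math.log(closes[-1] / closes[0])
```

For a single-step forecast, the target paired with the features of one window is the log return of the next window, so the last window of a series has no target.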

For each aggregation time window, we calculated the Heikin-Ashi candlesticks (OHLC) and the one-step-ahead logarithmic return of the close price. Two final datasets are used: for each aggregation time window, we combine the OHLC data of either Japanese or Heikin-Ashi candlesticks with the number of trades, the volume of trades (sum of volumes per trade), the body, the upper and lower shadows of the candlestick, and the target feature, which is the single-step-ahead logarithmic return of the close price. The PyCaret autoML library⁽²⁴⁾ is used for the rest of the tasks, with a time series split strategy with 10 folds. The time series split ensures a more realistic evaluation by using only historical data for training and reserving future data for testing. This matches the nature of time-based cryptocurrency data, where the future cannot be forecast from data that has not yet occurred. Missing values are replaced using simple mean-based imputation. Each dataset is split into a training split consisting of 57.14% of the data and a test split containing 42.86%. Data is normalized using the z-score to lessen the effect of outliers on the algorithms used.
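The idea behind a time series split can be sketched as an expanding-window scheme in which every fold validates only on observations that come after its training data. The `time_series_splits` helper below is a hypothetical illustration; it does not reproduce how PyCaret sizes its folds internally.

```python
def time_series_splits(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for expanding-window folds.

    Each fold trains on everything before its validation block, so no
    future observation leaks into training.
    """
    fold_size = n_samples // (n_folds + 1)
    if fold_size == 0:
        raise ValueError("not enough samples for the requested folds")
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        val_end = n_samples if k == n_folds else train_end + fold_size
        yield list(range(train_end)), list(range(train_end, val_end))


splits = list(time_series_splits(n_samples=22, n_folds=10))
```

Unlike a shuffled k-fold split, the training set grows from fold to fold while the chronological order is preserved.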

Machine learning regression algorithms

To conduct our comparison, five regressors are used: Huber Regressor, kNN Regressor, Light Gradient Boosting Machine, Linear Regression and Random Forest Regressor.

Huber regressor

A Huber regressor is a regression algorithm that uses the Huber loss function, a hybrid loss that applies squared loss to samples whose residual lies within a threshold (epsilon) and absolute loss to samples beyond that threshold. The objective function using the Huber loss function $H_\epsilon$ is defined as follows:
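A commonly documented form of this objective, the one given in the scikit-learn documentation for `HuberRegressor` (assumed here to match the cited definition), is shown below. The symbols $w$ (coefficients), $\sigma$ (scale), $\alpha$ (regularization strength), $X_i$ and $y_i$ (the $i$-th sample and its target) follow that documentation and are not taken from the text above.

```latex
\min_{w,\,\sigma} \; \sum_{i=1}^{n} \left( \sigma + H_\epsilon\!\left( \frac{X_i w - y_i}{\sigma} \right) \sigma \right) + \alpha \lVert w \rVert_2^2,
\qquad
H_\epsilon(z) =
\begin{cases}
z^2, & \text{if } |z| < \epsilon, \\
2\epsilon |z| - \epsilon^2, & \text{otherwise.}
\end{cases}
```

The quadratic branch applies to small scaled residuals and the linear branch to large ones, which is what limits the influence of outliers.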

This approach aims to balance robustness to outliers with sensitivity to their influence. In contrast to least squares regression, which penalizes outliers heavily, the Huber loss function penalizes outliers less heavily, thus reducing their impact on the overall regression fit.⁽²⁵⁾

kNN regressor

The kNN (K-Nearest Neighbors) regressor is a non-parametric, instance-based learning algorithm used in statistics and machine learning. Unlike parametric methods, kNN does not make assumptions about the underlying data distribution and uses the data itself for making predictions. It estimates the value of a continuous variable based on the 'K' nearest neighbors, where 'K' is a user-defined constant. The basic idea is that similar data points (neighbors) will have similar output values.⁽²⁶⁾ The first step in kNN regression is to find the 'K' closest points (neighbors) to the query point. This is typically done using a distance metric such as the Euclidean distance. The formula for the Euclidean distance between two points $x = (x_1, x_2)$ and $y = (y_1, y_2)$ in a 2-dimensional space is:

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$$

Once the 'K' nearest neighbors are identified, the output is the average of the dependent variable for these neighbors. The formula for the prediction is:

$$\hat{y} = \frac{1}{K} \sum_{i=1}^{K} y_i$$

where $y_i$ are the values of the 'K' nearest neighbors.

The kNN regressor is a flexible, easy-to-understand algorithm that can be very effective for certain datasets, especially those where the relationship between variables is complex and not easily captured by parametric models. However, its performance depends heavily on the choice of 'K', the distance metric, and the dimensionality and scaling of the data.
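The two steps described above, finding the nearest neighbors and averaging their targets, fit in a few lines. The sketch below is a brute-force illustration with hypothetical data and a hypothetical `knn_predict` helper; it uses `math.dist` (Python 3.8+) for the Euclidean distance.

```python
import math


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k nearest training points."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


train_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
train_y = [1.0, 2.0, 3.0, 10.0]
prediction = knn_predict(train_x, train_y, query=(0.2, 0.1), k=3)
```

With `k=3` the distant point `(5.0, 5.0)` is ignored; with `k=4` it would pull the prediction upward, which shows how sensitive the method is to the choice of 'K'.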

Light Gradient Boosting Machine

Light Gradient Boosting Machine (LightGBM) is an advanced ensemble machine learning algorithm based on decision trees, optimized for speed and efficiency. Initially, LightGBM creates a model that predicts the mean of the target variable. It then iteratively improves this model by building trees to predict residuals or errors from the current predictions. These residuals are computed as the negative gradient of the loss function, and the model updates by adding a fraction of the new tree's predictions. Key optimizations in LightGBM, like Gradient-based One-Side Sampling (GOSS) and histogram-based tree splitting, focus on processing efficiency and handling large datasets effectively.⁽²⁶⁾
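The boosting loop described above (start from the mean, fit a tree to the residuals, add a fraction of its predictions) can be sketched with one-split trees on a single feature. This is generic gradient boosting for squared loss, not LightGBM itself: it implements neither GOSS nor histogram-based splitting, and the helper names, data and hyperparameters are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on a 1-D feature under squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Return a predict(x) function built by boosting stumps on residuals."""
    base = sum(ys) / len(ys)  # initial model: mean of the target
    stumps = []

    def predict(x):
        total = base
        for threshold, left_mean, right_mean in stumps:
            total += learning_rate * (left_mean if x <= threshold else right_mean)
        return total

    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = boost(xs, ys)
mse_base = sum((y - sum(ys) / len(ys)) ** 2 for y in ys) / len(ys)
mse_boost = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

Comparing `mse_boost` with `mse_base` shows how much the added trees improve on the initial mean-only model on the training data.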

Linear regression

The linear regressor is a basic machine learning model used to forecast a dependent variable from one or more independent variables. The model posits a linear association between the input(s) and the output.⁽²⁷⁾ The formula for a simple linear regression with one independent variable is:

$$Y = \beta_0 + \beta_1 X + \epsilon$$

The dependent variable is denoted by $Y$ and the independent variable by $X$; $\beta_0$ denotes the y-intercept, $\beta_1$ the slope of the line (the change in $Y$ for a one-unit change in $X$), and $\epsilon$ the error term. In multiple linear regression, which involves $p$ independent variables $X_1, \dots, X_p$, the formula extends to incorporate these additional factors:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \epsilon$$

The objective of the model is to identify the optimal linear relationship by minimizing the sum of the squared differences (residuals) between the observed and predicted values.
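For a single independent variable, this least-squares problem has a closed-form solution: the slope is the ratio of the covariance of X and Y to the variance of X, and the intercept follows from the means. A minimal sketch with a hypothetical `fit_line` helper and synthetic points:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Points lying exactly on y = 1 + 2x, so the fit should recover that line.
b0, b1 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```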

Random Forest regressor

Random Forest, an ensemble learning technique predominantly employed for classification and regression, works by building numerous decision trees in the training phase. In regression tasks, the collective prediction is obtained by averaging the individual tree predictions. The fundamental idea is to combine the forecasts of multiple tree models, thereby enhancing overall performance and mitigating overfitting.⁽²⁸⁾ The formula for a Random Forest model's prediction is the average of the predictions from all the individual trees:

$$\hat{y} = \frac{1}{N} \sum_{i=1}^{N} f_i(x)$$

where $\hat{y}$ is the predicted output, $N$ is the number of trees in the forest, $f_i(x)$ represents the prediction from the $i$-th tree, and $x$ are the input features.

Evaluation metrics

R²

R² is a useful metric for assessing the goodness of fit of a regression model. It provides insights into how well the model's predictions align with the actual observed values. A higher R² value suggests a better fit, indicating that the model captures a larger proportion of the variability in the target variable.
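Using the standard definition, R² equals one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses a hypothetical `r_squared` helper and toy values to show the two reference points: a perfect fit and a model that always predicts the mean.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when the actual values are constant")
    return 1 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(actual, actual)       # exact fit
mean_only = r_squared(actual, [2.5] * 4)  # always predict the mean
```

A model that does worse than predicting the mean yields a negative R², which can occur on out-of-sample data.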

RMSLE

Root Mean Squared Logarithmic Error (RMSLE) is calculated by applying the natural logarithm to both the actual and predicted values and then taking the root mean square error of the differences. RMSLE is less sensitive to outliers than other error metrics, such as mean squared error, because it reduces the impact of large errors. Additionally, RMSLE penalizes underestimation more heavily than overestimation, making it a suitable choice for situations where underestimation is more costly than overestimation.⁽²⁹⁾
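The asymmetry between underestimation and overestimation can be checked numerically. The sketch below assumes the common definition of RMSLE that applies the logarithm to one plus each value (`math.log1p`), which requires values greater than -1; the `rmsle` helper and the sample numbers are hypothetical.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared error between log(1 + actual) and log(1 + predicted)."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2
               for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Same absolute error of 50 around a true value of 100.
under = rmsle([100.0], [50.0])   # prediction too low
over = rmsle([100.0], [150.0])   # prediction too high
```

Here `under` is larger than `over`, so the same absolute miss costs more when the prediction falls below the actual value.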

We used Root Mean Squared Logarithmic Error (RMSLE) and R-squared because they are advantageous for evaluating time series forecasting models in specific contexts. RMSLE is useful for its sensitivity to relative errors rather than absolute ones, making it well suited to data with a wide range of values. It also penalizes underestimations more than overestimations and is less sensitive to outliers due to its logarithmic nature. R-squared, on the other hand, measures how much of the variance in the dependent variable is explained by the model, offering an intuitive understanding of model performance. It is also scale-independent, allowing for comparisons across different datasets or scales.^(29,30,31,32,33,34) These metrics provide benefits over traditional ones like MAE or RMSE by offering better interpretability, being better suited to certain types of data (such as those with non-linear relationships or uneven variance) and offering robustness against specific types of errors.^(35,36,37,38,39,40,41)

RESULTS AND DISCUSSION

Table 1 presents the detailed results for all time windows and all regression algorithms, for both Japanese and Heikin-Ashi candlesticks. To visualize the results, figures 1 to 10 provide a side-by-side comparison of the R² results of Japanese and Heikin-Ashi candlesticks for each time window. The abbreviations used in the charts to represent the different algorithms are HR: Huber Regressor, KNR: kNN Regressor, LGBM: Light Gradient Boosting Machine, LR: Linear Regression and RFR: Random Forest Regressor. The results are reported in alphabetical order of the algorithms.

Figure 1. R-squared results of different algorithms applied to Japanese and Heikin-Ashi candlesticks using 1 day time window