Get trending papers in your email inbox once a day!
Get trending papers in your email inbox!
SubscribeMultilingual Arbitrage: Optimizing Data Pools to Accelerate Multilingual Progress
The use of synthetic data has played a critical role in recent state-of-art breakthroughs. However, overly relying on a single oracle teacher model to generate data has been shown to lead to model collapse and invite propagation of biases. These limitations are particularly evident in multilingual settings, where the absence of a universally effective teacher model that excels across all languages presents significant challenges. In this work, we address these extreme difference by introducing "multilingual arbitrage", which capitalizes on performance variations between multiple models for a given language. To do so, we strategically route samples through a diverse pool of models, each with unique strengths in different languages. Across exhaustive experiments on state-of-art models, our work suggests that arbitrage techniques allow for spectacular gains in performance that far outperform relying on a single teacher. In particular, compared to the best single teacher, we observe gains of up to 56.5% improvement in win rates averaged across all languages when switching to multilingual arbitrage. We observe the most significant gains for the least resourced languages in our pool.
Fundamentals of Perpetual Futures
Perpetual futures are the most popular cryptocurrency derivatives. Perpetuals offer leveraged exposure to their underlying without rollover or direct ownership. Unlike fixed-maturity futures, perpetuals are not guaranteed to converge to the spot price. To minimize the gap between perpetual and spot prices, long investors periodically pay shorts a funding rate proportional to this difference. We derive no-arbitrage prices for perpetual futures in frictionless markets and bounds in markets with trading costs. Empirically, deviations from these prices in crypto are larger than in traditional currency markets, comove across currencies, and diminish over time. An implied arbitrage strategy yields high Sharpe ratios.
Harnessing Deep Q-Learning for Enhanced Statistical Arbitrage in High-Frequency Trading: A Comprehensive Exploration
The realm of High-Frequency Trading (HFT) is characterized by rapid decision-making processes that capitalize on fleeting market inefficiencies. As the financial markets become increasingly competitive, there is a pressing need for innovative strategies that can adapt and evolve with changing market dynamics. Enter Reinforcement Learning (RL), a branch of machine learning where agents learn by interacting with their environment, making it an intriguing candidate for HFT applications. This paper dives deep into the integration of RL in statistical arbitrage strategies tailored for HFT scenarios. By leveraging the adaptive learning capabilities of RL, we explore its potential to unearth patterns and devise trading strategies that traditional methods might overlook. We delve into the intricate exploration-exploitation trade-offs inherent in RL and how they manifest in the volatile world of HFT. Furthermore, we confront the challenges of applying RL in non-stationary environments, typical of financial markets, and investigate methodologies to mitigate associated risks. Through extensive simulations and backtests, our research reveals that RL not only enhances the adaptability of trading strategies but also shows promise in improving profitability metrics and risk-adjusted returns. This paper, therefore, positions RL as a pivotal tool for the next generation of HFT-based statistical arbitrage, offering insights for both researchers and practitioners in the field.
IPProtect: protecting the intellectual property of visual datasets during data valuation
Data trading is essential to accelerate the development of data-driven machine learning pipelines. The central problem in data trading is to estimate the utility of a seller's dataset with respect to a given buyer's machine learning task, also known as data valuation. Typically, data valuation requires one or more participants to share their raw dataset with others, leading to potential risks of intellectual property (IP) violations. In this paper, we tackle the novel task of preemptively protecting the IP of datasets that need to be shared during data valuation. First, we identify and formalize two kinds of novel IP risks in visual datasets: data-item (image) IP and statistical (dataset) IP. Then, we propose a novel algorithm to convert the raw dataset into a sanitized version, that provides resistance to IP violations, while at the same time allowing accurate data valuation. The key idea is to limit the transfer of information from the raw dataset to the sanitized dataset, thereby protecting against potential intellectual property violations. Next, we analyze our method for the likely existence of a solution and immunity against reconstruction attacks. Finally, we conduct extensive experiments on three computer vision datasets demonstrating the advantages of our method in comparison to other baselines.
A Stochastic Thermodynamics Approach to Price Impact and Round-Trip Arbitrage: Theory and Empirical Implications
This paper develops a comprehensive theoretical framework that imports concepts from stochastic thermodynamics to model price impact and characterize the feasibility of round-trip arbitrage in financial markets. A trading cycle is treated as a non-equilibrium thermodynamic process, where price impact represents dissipative work and market noise plays the role of thermal fluctuations. The paper proves a Financial Second Law: under general convex impact functionals, any round-trip trading strategy yields non-positive expected profit. This structural constraint is complemented by a fluctuation theorem that bounds the probability of profitable cycles in terms of dissipated work and market volatility. The framework introduces a statistical ensemble of trading strategies governed by a Gibbs measure, leading to a free energy decomposition that connects expected cost, strategy entropy, and a market temperature parameter. The framework provides rigorous, testable inequalities linking microstructural impact to macroscopic no-arbitrage conditions, offering a novel physics-inspired perspective on market efficiency. The paper derives explicit analytical results for prototypical trading strategies and discusses empirical validation protocols.
Binary Tree Option Pricing Under Market Microstructure Effects: A Random Forest Approach
We propose a machine learning-based extension of the classical binomial option pricing model that incorporates key market microstructure effects. Traditional models assume frictionless markets, overlooking empirical features such as bid-ask spreads, discrete price movements, and serial return correlations. Our framework augments the binomial tree with path-dependent transition probabilities estimated via Random Forest classifiers trained on high-frequency market data. This approach preserves no-arbitrage conditions while embedding real-world trading dynamics into the pricing model. Using 46,655 minute-level observations of SPY from January to June 2025, we achieve an AUC of 88.25% in forecasting one-step price movements. Order flow imbalance is identified as the most influential predictor, contributing 43.2% to feature importance. After resolving time-scaling inconsistencies in tree construction, our model yields option prices that deviate by 13.79% from Black-Scholes benchmarks, highlighting the impact of microstructure on fair value estimation. While computational limitations restrict the model to short-term derivatives, our results offer a robust, data-driven alternative to classical pricing methods grounded in empirical market behavior.
FAR-Trans: An Investment Dataset for Financial Asset Recommendation
Financial asset recommendation (FAR) is a sub-domain of recommender systems which identifies useful financial securities for investors, with the expectation that they will invest capital on the recommended assets. FAR solutions analyse and learn from multiple data sources, including time series pricing data, customer profile information and expectations, as well as past investments. However, most models have been developed over proprietary datasets, making a comparison over a common benchmark impossible. In this paper, we aim to solve this problem by introducing FAR-Trans, the first public dataset for FAR, containing pricing information and retail investor transactions acquired from a large European financial institution. We also provide a bench-marking comparison between eleven FAR algorithms over the data for use as future baselines. The dataset can be downloaded from https://doi.org/10.5525/gla.researchdata.1658 .
Universal features of price formation in financial markets: perspectives from Deep Learning
Using a large-scale Deep Learning approach applied to a high-frequency database containing billions of electronic market quotes and transactions for US equities, we uncover nonparametric evidence for the existence of a universal and stationary price formation mechanism relating the dynamics of supply and demand for a stock, as revealed through the order book, to subsequent variations in its market price. We assess the model by testing its out-of-sample predictions for the direction of price moves given the history of price and order flow, across a wide range of stocks and time periods. The universal price formation model is shown to exhibit a remarkably stable out-of-sample prediction accuracy across time, for a wide range of stocks from different sectors. Interestingly, these results also hold for stocks which are not part of the training sample, showing that the relations captured by the model are universal and not asset-specific. The universal model --- trained on data from all stocks --- outperforms, in terms of out-of-sample prediction accuracy, asset-specific linear and nonlinear models trained on time series of any given stock, showing that the universal nature of price formation weighs in favour of pooling together financial data from various stocks, rather than designing asset- or sector-specific models as commonly done. Standard data normalizations based on volatility, price level or average spread, or partitioning the training data into sectors or categories such as large/small tick stocks, do not improve training results. On the other hand, inclusion of price and order flow history over many past observations is shown to improve forecasting performance, showing evidence of path-dependence in price dynamics.
Stock Market Prediction using Natural Language Processing -- A Survey
The stock market is a network which provides a platform for almost all major economic transactions. While investing in the stock market is a good idea, investing in individual stocks may not be, especially for the casual investor. Smart stock-picking requires in-depth research and plenty of dedication. Predicting this stock value offers enormous arbitrage profit opportunities. This attractiveness of finding a solution has prompted researchers to find a way past problems like volatility, seasonality, and dependence on time. This paper surveys recent literature in the domain of natural language processing and machine learning techniques used to predict stock market movements. The main contributions of this paper include the sophisticated categorizations of many recent articles and the illustration of the recent trends of research in stock market prediction and its related areas.
Proof-of-Contribution-Based Design for Collaborative Machine Learning on Blockchain
We consider a project (model) owner that would like to train a model by utilizing the local private data and compute power of interested data owners, i.e., trainers. Our goal is to design a data marketplace for such decentralized collaborative/federated learning applications that simultaneously provides i) proof-of-contribution based reward allocation so that the trainers are compensated based on their contributions to the trained model; ii) privacy-preserving decentralized model training by avoiding any data movement from data owners; iii) robustness against malicious parties (e.g., trainers aiming to poison the model); iv) verifiability in the sense that the integrity, i.e., correctness, of all computations in the data market protocol including contribution assessment and outlier detection are verifiable through zero-knowledge proofs; and v) efficient and universal design. We propose a blockchain-based marketplace design to achieve all five objectives mentioned above. In our design, we utilize a distributed storage infrastructure and an aggregator aside from the project owner and the trainers. The aggregator is a processing node that performs certain computations, including assessing trainer contributions, removing outliers, and updating hyper-parameters. We execute the proposed data market through a blockchain smart contract. The deployed smart contract ensures that the project owner cannot evade payment, and honest trainers are rewarded based on their contributions at the end of training. Finally, we implement the building blocks of the proposed data market and demonstrate their applicability in practical scenarios through extensive experiments.
Nash Equilibrium between Brokers and Traders
We study the perfect information Nash equilibrium between a broker and her clients -- an informed trader and an uniformed trader. In our model, the broker trades in the lit exchange where trades have instantaneous and transient price impact with exponential resilience, while both clients trade with the broker. The informed trader and the broker maximise expected wealth subject to inventory penalties, while the uninformed trader is not strategic and sends the broker random buy and sell orders. We characterise the Nash equilibrium of the trading strategies with the solution to a coupled system of forward-backward stochastic differential equations (FBSDEs). We solve this system explicitly and study the effect of information, profitability, and inventory control in the trading strategies of the broker and the informed trader.
Continuous Risk Factor Models: Analyzing Asset Correlations through Energy Distance
This paper introduces a novel approach to financial risk analysis that does not rely on traditional price and market data, instead using market news to model assets as distributions over a metric space of risk factors. By representing asset returns as integrals over the scalar field of these risk factors, we derive the covariance structure between asset returns. Utilizing encoder-only language models to embed this news data, we explore the relationships between asset return distributions through the concept of Energy Distance, establishing connections between distributional differences and excess returns co-movements. This data-agnostic approach provides new insights into portfolio diversification, risk management, and the construction of hedging strategies. Our findings have significant implications for both theoretical finance and practical risk management, offering a more robust framework for modelling complex financial systems without depending on conventional market data.
Intelligent Trading Systems: A Sentiment-Aware Reinforcement Learning Approach
The feasibility of making profitable trades on a single asset on stock exchanges based on patterns identification has long attracted researchers. Reinforcement Learning (RL) and Natural Language Processing have gained notoriety in these single-asset trading tasks, but only a few works have explored their combination. Moreover, some issues are still not addressed, such as extracting market sentiment momentum through the explicit capture of sentiment features that reflect the market condition over time and assessing the consistency and stability of RL results in different situations. Filling this gap, we propose the Sentiment-Aware RL (SentARL) intelligent trading system that improves profit stability by leveraging market mood through an adaptive amount of past sentiment features drawn from textual news. We evaluated SentARL across twenty assets, two transaction costs, and five different periods and initializations to show its consistent effectiveness against baselines. Subsequently, this thorough assessment allowed us to identify the boundary between news coverage and market sentiment regarding the correlation of price-time series above which SentARL's effectiveness is outstanding.
PreBit -- A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin
Bitcoin, with its ever-growing popularity, has demonstrated extreme price volatility since its origin. This volatility, together with its decentralised nature, make Bitcoin highly subjective to speculative trading as compared to more traditional assets. In this paper, we propose a multimodal model for predicting extreme price fluctuations. This model takes as input a variety of correlated assets, technical indicators, as well as Twitter content. In an in-depth study, we explore whether social media discussions from the general public on Bitcoin have predictive power for extreme price movements. A dataset of 5,000 tweets per day containing the keyword `Bitcoin' was collected from 2015 to 2021. This dataset, called PreBit, is made available online. In our hybrid model, we use sentence-level FinBERT embeddings, pretrained on financial lexicons, so as to capture the full contents of the tweets and feed it to the model in an understandable way. By combining these embeddings with a Convolutional Neural Network, we built a predictive model for significant market movements. The final multimodal ensemble model includes this NLP model together with a model based on candlestick data, technical indicators and correlated asset prices. In an ablation study, we explore the contribution of the individual modalities. Finally, we propose and backtest a trading strategy based on the predictions of our models with varying prediction threshold and show that it can used to build a profitable trading strategy with a reduced risk over a `hold' or moving average strategy.
Empirical Study of Market Impact Conditional on Order-Flow Imbalance
In this research, we have empirically investigated the key drivers affecting liquidity in equity markets. We illustrated how theoretical models, such as Kyle's model, of agents' interplay in the financial markets, are aligned with the phenomena observed in publicly available trades and quotes data. Specifically, we confirmed that for small signed order-flows, the price impact grows linearly with increase in the order-flow imbalance. We have, further, implemented a machine learning algorithm to forecast market impact given a signed order-flow. Our findings suggest that machine learning models can be used in estimation of financial variables; and predictive accuracy of such learning algorithms can surpass the performance of traditional statistical approaches. Understanding the determinants of price impact is crucial for several reasons. From a theoretical stance, modelling the impact provides a statistical measure of liquidity. Practitioners adopt impact models as a pre-trade tool to estimate expected transaction costs and optimize the execution of their strategies. This further serves as a post-trade valuation benchmark as suboptimal execution can significantly deteriorate a portfolio performance. More broadly, the price impact reflects the balance of liquidity across markets. This is of central importance to regulators as it provides an all-encompassing explanation of the correlation between market design and systemic risk, enabling regulators to design more stable and efficient markets.
Reinforcement-Learning Portfolio Allocation with Dynamic Embedding of Market Information
We develop a portfolio allocation framework that leverages deep learning techniques to address challenges arising from high-dimensional, non-stationary, and low-signal-to-noise market information. Our approach includes a dynamic embedding method that reduces the non-stationary, high-dimensional state space into a lower-dimensional representation. We design a reinforcement learning (RL) framework that integrates generative autoencoders and online meta-learning to dynamically embed market information, enabling the RL agent to focus on the most impactful parts of the state space for portfolio allocation decisions. Empirical analysis based on the top 500 U.S. stocks demonstrates that our framework outperforms common portfolio benchmarks and the predict-then-optimize (PTO) approach using machine learning, particularly during periods of market stress. Traditional factor models do not fully explain this superior performance. The framework's ability to time volatility reduces its market exposure during turbulent times. Ablation studies confirm the robustness of this performance across various reinforcement learning algorithms. Additionally, the embedding and meta-learning techniques effectively manage the complexities of high-dimensional, noisy, and non-stationary financial data, enhancing both portfolio performance and risk management.
Learning to Predict Short-Term Volatility with Order Flow Image Representation
Introduction: The paper addresses the challenging problem of predicting the short-term realized volatility of the Bitcoin price using order flow information. The inherent stochastic nature and anti-persistence of price pose difficulties in accurate prediction. Methods: To address this, we propose a method that transforms order flow data over a fixed time interval (snapshots) into images. The order flow includes trade sizes, trade directions, and limit order book, and is mapped into image colour channels. These images are then used to train both a simple 3-layer Convolutional Neural Network (CNN) and more advanced ResNet-18 and ConvMixer, with additionally supplementing them with hand-crafted features. The models are evaluated against classical GARCH, Multilayer Perceptron trained on raw data, and a naive guess method that considers current volatility as a prediction. Results: The experiments are conducted using price data from January 2021 and evaluate model performance in terms of root mean square error (RMSPE). The results show that our order flow representation with a CNN as a predictive model achieves the best performance, with an RMSPE of 0.85+/-1.1 for the model with aggregated features and 1.0+/-1.4 for the model without feature supplementation. ConvMixer with feature supplementation follows closely. In comparison, the RMSPE for the naive guess method was 1.4+/-3.0.
LiveTradeBench: Seeking Real-World Alpha with Large Language Models
Large language models (LLMs) achieve strong performance across benchmarks--from knowledge quizzes and math reasoning to web-agent tasks--but these tests occur in static settings, lacking real dynamics and uncertainty. Consequently, they evaluate isolated reasoning or problem-solving rather than decision-making under uncertainty. To address this, we introduce LiveTradeBench, a live trading environment for evaluating LLM agents in realistic and evolving markets. LiveTradeBench follows three design principles: (i) Live data streaming of market prices and news, eliminating dependence on offline backtesting and preventing information leakage while capturing real-time uncertainty; (ii) a portfolio-management abstraction that extends control from single-asset actions to multi-asset allocation, integrating risk management and cross-asset reasoning; and (iii) multi-market evaluation across structurally distinct environments--U.S. stocks and Polymarket prediction markets--differing in volatility, liquidity, and information flow. At each step, an agent observes prices, news, and its portfolio, then outputs percentage allocations that balance risk and return. Using LiveTradeBench, we run 50-day live evaluations of 21 LLMs across families. Results show that (1) high LMArena scores do not imply superior trading outcomes; (2) models display distinct portfolio styles reflecting risk appetite and reasoning dynamics; and (3) some LLMs effectively leverage live signals to adapt decisions. These findings expose a gap between static evaluation and real-world competence, motivating benchmarks that test sequential decision making and consistency under live uncertainty.
Risk-sensitive Reinforcement Learning Based on Convex Scoring Functions
We propose a reinforcement learning (RL) framework under a broad class of risk objectives, characterized by convex scoring functions. This class covers many common risk measures, such as variance, Expected Shortfall, entropic Value-at-Risk, and mean-risk utility. To resolve the time-inconsistency issue, we consider an augmented state space and an auxiliary variable and recast the problem as a two-state optimization problem. We propose a customized Actor-Critic algorithm and establish some theoretical approximation guarantees. A key theoretical contribution is that our results do not require the Markov decision process to be continuous. Additionally, we propose an auxiliary variable sampling method inspired by the alternating minimization algorithm, which is convergent under certain conditions. We validate our approach in simulation experiments with a financial application in statistical arbitrage trading, demonstrating the effectiveness of the algorithm.
TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis
Recent advancements in large language models (LLMs) have enabled powerful agent-based applications in finance, particularly for sentiment analysis, financial report comprehension, and stock forecasting. However, existing systems often lack inter-agent coordination, structured self-reflection, and access to high-quality, domain-specific post-training data such as data from trading activities including both market conditions and agent decisions. These data are crucial for agents to understand the market dynamics, improve the quality of decision-making and promote effective coordination. We introduce TradingGroup, a multi-agent trading system designed to address these limitations through a self-reflective architecture and an end-to-end data-synthesis pipeline. TradingGroup consists of specialized agents for news sentiment analysis, financial report interpretation, stock trend forecasting, trading style adaptation, and a trading decision making agent that merges all signals and style preferences to produce buy, sell or hold decisions. Specifically, we design self-reflection mechanisms for the stock forecasting, style, and decision-making agents to distill past successes and failures for similar reasoning in analogous future scenarios and a dynamic risk-management model to offer configurable dynamic stop-loss and take-profit mechanisms. In addition, TradingGroup embeds an automated data-synthesis and annotation pipeline that generates high-quality post-training data for further improving the agent performance through post-training. Our backtesting experiments across five real-world stock datasets demonstrate TradingGroup's superior performance over rule-based, machine learning, reinforcement learning, and existing LLM-based trading strategies.
Hedging Properties of Algorithmic Investment Strategies using Long Short-Term Memory and Time Series models for Equity Indices
This paper proposes a novel approach to hedging portfolios of risky assets when financial markets are affected by financial turmoils. We introduce a completely novel approach to diversification activity not on the level of single assets but on the level of ensemble algorithmic investment strategies (AIS) built based on the prices of these assets. We employ four types of diverse theoretical models (LSTM - Long Short-Term Memory, ARIMA-GARCH - Autoregressive Integrated Moving Average - Generalized Autoregressive Conditional Heteroskedasticity, momentum, and contrarian) to generate price forecasts, which are then used to produce investment signals in single and complex AIS. In such a way, we are able to verify the diversification potential of different types of investment strategies consisting of various assets (energy commodities, precious metals, cryptocurrencies, or soft commodities) in hedging ensemble AIS built for equity indices (S&P 500 index). Empirical data used in this study cover the period between 2004 and 2022. Our main conclusion is that LSTM-based strategies outperform the other models and that the best diversifier for the AIS built for the S&P 500 index is the AIS built for Bitcoin. Finally, we test the LSTM model for a higher frequency of data (1 hour). We conclude that it outperforms the results obtained using daily data.
A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist
Financial trading is a crucial component of the markets, informed by a multimodal information landscape encompassing news, prices, and Kline charts, and encompasses diverse tasks such as quantitative trading and high-frequency trading with various assets. While advanced AI techniques like deep learning and reinforcement learning are extensively utilized in finance, their application in financial trading tasks often faces challenges due to inadequate handling of multimodal data and limited generalizability across various tasks. To address these challenges, we present FinAgent, a multimodal foundational agent with tool augmentation for financial trading. FinAgent's market intelligence module processes a diverse range of data-numerical, textual, and visual-to accurately analyze the financial market. Its unique dual-level reflection module not only enables rapid adaptation to market dynamics but also incorporates a diversified memory retrieval system, enhancing the agent's ability to learn from historical data and improve decision-making processes. The agent's emphasis on reasoning for actions fosters trust in its financial decisions. Moreover, FinAgent integrates established trading strategies and expert insights, ensuring that its trading approaches are both data-driven and rooted in sound financial principles. With comprehensive experiments on 6 financial datasets, including stocks and Crypto, FinAgent significantly outperforms 9 state-of-the-art baselines in terms of 6 financial metrics with over 36% average improvement on profit. Specifically, a 92.27% return (a 84.39% relative improvement) is achieved on one dataset. Notably, FinAgent is the first advanced multimodal foundation agent designed for financial trading tasks.
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?
Large language models (LLMs) have recently demonstrated strong capabilities as autonomous agents, showing promise in reasoning, tool use, and sequential decision-making. While prior benchmarks have evaluated LLM agents in domains such as software engineering and scientific discovery, the finance domain remains underexplored, despite its direct relevance to economic value and high-stakes decision-making. Existing financial benchmarks primarily test static knowledge through question answering, but they fall short of capturing the dynamic and iterative nature of trading. To address this gap, we introduce StockBench, a contamination-free benchmark designed to evaluate LLM agents in realistic, multi-month stock trading environments. Agents receive daily market signals -- including prices, fundamentals, and news -- and must make sequential buy, sell, or hold decisions. Performance is assessed using financial metrics such as cumulative return, maximum drawdown, and the Sortino ratio. Our evaluation of state-of-the-art proprietary (e.g., GPT-5, Claude-4) and open-weight (e.g., Qwen3, Kimi-K2, GLM-4.5) models shows that while most LLM agents struggle to outperform the simple buy-and-hold baseline, several models demonstrate the potential to deliver higher returns and manage risk more effectively. These findings highlight both the challenges and opportunities in developing LLM-powered financial agents, showing that excelling at static financial knowledge tasks does not necessarily translate into successful trading strategies. We release StockBench as an open-source resource to support reproducibility and advance future research in this domain.
Deep Reinforcement Learning for Quantitative Trading
Artificial Intelligence (AI) and Machine Learning (ML) are transforming the domain of Quantitative Trading (QT) through the deployment of advanced algorithms capable of sifting through extensive financial datasets to pinpoint lucrative investment openings. AI-driven models, particularly those employing ML techniques such as deep learning and reinforcement learning, have shown great prowess in predicting market trends and executing trades at a speed and accuracy that far surpass human capabilities. Its capacity to automate critical tasks, such as discerning market conditions and executing trading strategies, has been pivotal. However, persistent challenges exist in current QT methods, especially in effectively handling noisy and high-frequency financial data. Striking a balance between exploration and exploitation poses another challenge for AI-driven trading agents. To surmount these hurdles, our proposed solution, QTNet, introduces an adaptive trading model that autonomously formulates QT strategies through an intelligent trading agent. Incorporating deep reinforcement learning (DRL) with imitative learning methodologies, we bolster the proficiency of our model. To tackle the challenges posed by volatile financial datasets, we conceptualize the QT mechanism within the framework of a Partially Observable Markov Decision Process (POMDP). Moreover, by embedding imitative learning, the model can capitalize on traditional trading tactics, nurturing a balanced synergy between discovery and utilization. For a more realistic simulation, our trading agent undergoes training using minute-frequency data sourced from the live financial market. Experimental findings underscore the model's proficiency in extracting robust market features and its adaptability to diverse market conditions.
Trading-R1: Financial Trading with LLM Reasoning via Reinforcement Learning
Developing professional, structured reasoning on par with human financial analysts and traders remains a central challenge in AI for finance, where markets demand interpretability and trust. Traditional time-series models lack explainability, while LLMs face challenges in turning natural-language analysis into disciplined, executable trades. Although reasoning LLMs have advanced in step-by-step planning and verification, their application to risk-sensitive financial decisions is underexplored. We present Trading-R1, a financially-aware model that incorporates strategic thinking and planning for comprehensive thesis composition, facts-grounded analysis, and volatility-adjusted decision making. Trading-R1 aligns reasoning with trading principles through supervised fine-tuning and reinforcement learning with a three-stage easy-to-hard curriculum. Training uses Tauric-TR1-DB, a 100k-sample corpus spanning 18 months, 14 equities, and five heterogeneous financial data sources. Evaluated on six major equities and ETFs, Trading-R1 demonstrates improved risk-adjusted returns and lower drawdowns compared to both open-source and proprietary instruction-following models as well as reasoning models. The system generates structured, evidence-based investment theses that support disciplined and interpretable trading decisions. Trading-R1 Terminal will be released at https://github.com/TauricResearch/Trading-R1.
Research on Optimizing Real-Time Data Processing in High-Frequency Trading Algorithms using Machine Learning
High-frequency trading (HFT) represents a pivotal and intensely competitive domain within the financial markets. The velocity and accuracy of data processing exert a direct influence on profitability, underscoring the significance of this field. The objective of this work is to optimise the real-time processing of data in high-frequency trading algorithms. The dynamic feature selection mechanism is responsible for monitoring and analysing market data in real time through clustering and feature weight analysis, with the objective of automatically selecting the most relevant features. This process employs an adaptive feature extraction method, which enables the system to respond and adjust its feature set in a timely manner when the data input changes, thus ensuring the efficient utilisation of data. The lightweight neural networks are designed in a modular fashion, comprising fast convolutional layers and pruning techniques that facilitate the expeditious completion of data processing and output prediction. In contrast to conventional deep learning models, the neural network architecture has been specifically designed to minimise the number of parameters and computational complexity, thereby markedly reducing the inference time. The experimental results demonstrate that the model is capable of maintaining consistent performance in the context of varying market conditions, thereby illustrating its advantages in terms of processing speed and revenue enhancement.
BitTensor: A Peer-to-Peer Intelligence Market
As with other commodities, markets could help us efficiently produce machine intelligence. We propose a market where intelligence is priced by other intelligence systems peer-to-peer across the internet. Peers rank each other by training neural networks which learn the value of their neighbors. Scores accumulate on a digital ledger where high ranking peers are monetarily rewarded with additional weight in the network. However, this form of peer-ranking is not resistant to collusion, which could disrupt the accuracy of the mechanism. The solution is a connectivity-based regularization which exponentially rewards trusted peers, making the system resistant to collusion of up to 50 percent of the network weight. The result is a collectively run intelligence market which continual produces newly trained models and pays contributors who create information theoretic value.
Realised Volatility Forecasting: Machine Learning via Financial Word Embedding
This study develops FinText, a financial word embedding compiled from 15 years of business news archives. The results show that FinText produces substantially more accurate results than general word embeddings based on the gold-standard financial benchmark we introduced. In contrast to well-known econometric models, and over the sample period from 27 July 2007 to 27 January 2022 for 23 NASDAQ stocks, using stock-related news, our simple natural language processing model supported by different word embeddings improves realised volatility forecasts on high volatility days. This improvement in realised volatility forecasting performance switches to normal volatility days when general hot news is used. By utilising SHAP, an Explainable AI method, we also identify and classify key phrases in stock-related and general hot news that moved volatility.
Distributional Reinforcement Learning-based Energy Arbitrage Strategies in Imbalance Settlement Mechanism
Growth in the penetration of renewable energy sources makes supply more uncertain and leads to an increase in the system imbalance. This trend, together with the single imbalance pricing, opens an opportunity for balance responsible parties (BRPs) to perform energy arbitrage in the imbalance settlement mechanism. To this end, we propose a battery control framework based on distributional reinforcement learning (DRL). Our proposed control framework takes a risk-sensitive perspective, allowing BRPs to adjust their risk preferences: we aim to optimize a weighted sum of the arbitrage profit and a risk measure while constraining the daily number of cycles for the battery. We assess the performance of our proposed control framework using the Belgian imbalance prices of 2022 and compare two state-of-the-art RL methods, deep Q learning and soft actor-critic. Results reveal that the distributional soft actor-critic method can outperform other methods. Moreover, we note that our fully risk-averse agent appropriately learns to hedge against the risk related to the unknown imbalance price by (dis)charging the battery only when the agent is more certain about the price.
TRADES: Generating Realistic Market Simulations with Diffusion Models
Financial markets are complex systems characterized by high statistical noise, nonlinearity, and constant evolution. Thus, modeling them is extremely hard. We address the task of generating realistic and responsive Limit Order Book (LOB) market simulations, which are fundamental for calibrating and testing trading strategies, performing market impact experiments, and generating synthetic market data. Previous works lack realism, usefulness, and responsiveness of the generated simulations. To bridge this gap, we propose a novel TRAnsformer-based Denoising Diffusion Probabilistic Engine for LOB Simulations (TRADES). TRADES generates realistic order flows conditioned on the state of the market, leveraging a transformer-based architecture that captures the temporal and spatial characteristics of high-frequency market data. There is a notable absence of quantitative metrics for evaluating generative market simulation models in the literature. To tackle this problem, we adapt the predictive score, a metric measured as an MAE, by training a stock price predictive model on synthetic data and testing it on real data. We compare TRADES with previous works on two stocks, reporting an x3.27 and x3.47 improvement over SoTA according to the predictive score, demonstrating that we generate useful synthetic market data for financial downstream tasks. We assess TRADES's market simulation realism and responsiveness, showing that it effectively learns the conditional data distribution and successfully reacts to an experimental agent, giving sprout to possible calibrations and evaluations of trading strategies and market impact experiments. We developed DeepMarket, the first open-source Python framework for market simulation with deep learning. Our repository includes a synthetic LOB dataset composed of TRADES's generates simulations. We release the code at github.com/LeonardoBerti00/DeepMarket.
Extending Deep Reinforcement Learning Frameworks in Cryptocurrency Market Making
There has been a recent surge in interest in the application of artificial intelligence to automated trading. Reinforcement learning has been applied to single- and multi-instrument use cases, such as market making or portfolio management. This paper proposes a new approach to framing cryptocurrency market making as a reinforcement learning challenge by introducing an event-based environment wherein an event is defined as a change in price greater or less than a given threshold, as opposed to by tick or time-based events (e.g., every minute, hour, day, etc.). Two policy-based agents are trained to learn a market making trading strategy using eight days of training data and evaluate their performance using 30 days of testing data. Limit order book data recorded from Bitmex exchange is used to validate this approach, which demonstrates improved profit and stability compared to a time-based approach for both agents when using a simple multi-layer perceptron neural network for function approximation and seven different reward functions.
TraderTalk: An LLM Behavioural ABM applied to Simulating Human Bilateral Trading Interactions
We introduce a novel hybrid approach that augments Agent-Based Models (ABMs) with behaviors generated by Large Language Models (LLMs) to simulate human trading interactions. We call our model TraderTalk. Leveraging LLMs trained on extensive human-authored text, we capture detailed and nuanced representations of bilateral conversations in financial trading. Applying this Generative Agent-Based Model (GABM) to government bond markets, we replicate trading decisions between two stylised virtual humans. Our method addresses both structural challenges, such as coordinating turn-taking between realistic LLM-based agents, and design challenges, including the interpretation of LLM outputs by the agent model. By exploring prompt design opportunistically rather than systematically, we enhance the realism of agent interactions without exhaustive overfitting or model reliance. Our approach successfully replicates trade-to-order volume ratios observed in related asset markets, demonstrating the potential of LLM-augmented ABMs in financial simulations
Quantitative Risk Management in Volatile Markets with an Expectile-Based Framework for the FTSE Index
This research presents a framework for quantitative risk management in volatile markets, specifically focusing on expectile-based methodologies applied to the FTSE 100 index. Traditional risk measures such as Value-at-Risk (VaR) have demonstrated significant limitations during periods of market stress, as evidenced during the 2008 financial crisis and subsequent volatile periods. This study develops an advanced expectile-based framework that addresses the shortcomings of conventional quantile-based approaches by providing greater sensitivity to tail losses and improved stability in extreme market conditions. The research employs a dataset spanning two decades of FTSE 100 returns, incorporating periods of high volatility, market crashes, and recovery phases. Our methodology introduces novel mathematical formulations for expectile regression models, enhanced threshold determination techniques using time series analysis, and robust backtesting procedures. The empirical results demonstrate that expectile-based Value-at-Risk (EVaR) consistently outperforms traditional VaR measures across various confidence levels and market conditions. The framework exhibits superior performance during volatile periods, with reduced model risk and enhanced predictive accuracy. Furthermore, the study establishes practical implementation guidelines for financial institutions and provides evidence-based recommendations for regulatory compliance and portfolio management. The findings contribute significantly to the literature on financial risk management and offer practical tools for practitioners dealing with volatile market environments.
Learn to Rank Risky Investors: A Case Study of Predicting Retail Traders' Behaviour and Profitability
Identifying risky traders with high profits in financial markets is crucial for market makers, such as trading exchanges, to ensure effective risk management through real-time decisions on regulation compliance and hedging. However, capturing the complex and dynamic behaviours of individual traders poses significant challenges. Traditional classification and anomaly detection methods often establish a fixed risk boundary, failing to account for this complexity and dynamism. To tackle this issue, we propose a profit-aware risk ranker (PA-RiskRanker) that reframes the problem of identifying risky traders as a ranking task using Learning-to-Rank (LETOR) algorithms. Our approach features a Profit-Aware binary cross entropy (PA-BCE) loss function and a transformer-based ranker enhanced with a self-cross-trader attention pipeline. These components effectively integrate profit and loss (P&L) considerations into the training process while capturing intra- and inter-trader relationships. Our research critically examines the limitations of existing deep learning-based LETOR algorithms in trading risk management, which often overlook the importance of P&L in financial scenarios. By prioritising P&L, our method improves risky trader identification, achieving an 8.4% increase in F1 score compared to state-of-the-art (SOTA) ranking models like Rankformer. Additionally, it demonstrates a 10%-17% increase in average profit compared to all benchmark models.
MM-DREX: Multimodal-Driven Dynamic Routing of LLM Experts for Financial Trading
The inherent non-stationarity of financial markets and the complexity of multi-modal information pose significant challenges to existing quantitative trading models. Traditional methods relying on fixed structures and unimodal data struggle to adapt to market regime shifts, while large language model (LLM)-driven solutions - despite their multi-modal comprehension - suffer from static strategies and homogeneous expert designs, lacking dynamic adjustment and fine-grained decision mechanisms. To address these limitations, we propose MM-DREX: a Multimodal-driven, Dynamically-Routed EXpert framework based on large language models. MM-DREX explicitly decouples market state perception from strategy execution to enable adaptive sequential decision-making in non-stationary environments. Specifically, it (1) introduces a vision-language model (VLM)-powered dynamic router that jointly analyzes candlestick chart patterns and long-term temporal features to allocate real-time expert weights; (2) designs four heterogeneous trading experts (trend, reversal, breakout, positioning) generating specialized fine-grained sub-strategies; and (3) proposes an SFT-RL hybrid training paradigm to synergistically optimize the router's market classification capability and experts' risk-adjusted decision-making. Extensive experiments on multi-modal datasets spanning stocks, futures, and cryptocurrencies demonstrate that MM-DREX significantly outperforms 15 baselines (including state-of-the-art financial LLMs and deep reinforcement learning models) across key metrics: total return, Sharpe ratio, and maximum drawdown, validating its robustness and generalization. Additionally, an interpretability module traces routing logic and expert behavior in real time, providing an audit trail for strategy transparency.
Sector Rotation by Factor Model and Fundamental Analysis
This study presents an analytical approach to sector rotation, leveraging both factor models and fundamental metrics. We initiate with a systematic classification of sectors, followed by an empirical investigation into their returns. Through factor analysis, the paper underscores the significance of momentum and short-term reversion in dictating sectoral shifts. A subsequent in-depth fundamental analysis evaluates metrics such as PE, PB, EV-to-EBITDA, Dividend Yield, among others. Our primary contribution lies in developing a predictive framework based on these fundamental indicators. The constructed models, post rigorous training, exhibit noteworthy predictive capabilities. The findings furnish a nuanced understanding of sector rotation strategies, with implications for asset management and portfolio construction in the financial domain.
When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents
Although Large Language Model (LLM)-based agents are increasingly used in financial trading, it remains unclear whether they can reason and adapt in live markets, as most studies test models instead of agents, cover limited periods and assets, and rely on unverified data. To address these gaps, we introduce Agent Market Arena (AMA), the first lifelong, real-time benchmark for evaluating LLM-based trading agents across multiple markets. AMA integrates verified trading data, expert-checked news, and diverse agent architectures within a unified trading framework, enabling fair and continuous comparison under real conditions. It implements four agents, including InvestorAgent as a single-agent baseline, TradeAgent and HedgeFundAgent with different risk styles, and DeepFundAgent with memory-based reasoning, and evaluates them across GPT-4o, GPT-4.1, Claude-3.5-haiku, Claude-sonnet-4, and Gemini-2.0-flash. Live experiments on both cryptocurrency and stock markets demonstrate that agent frameworks display markedly distinct behavioral patterns, spanning from aggressive risk-taking to conservative decision-making, whereas model backbones contribute less to outcome variation. AMA thus establishes a foundation for rigorous, reproducible, and continuously evolving evaluation of financial reasoning and trading intelligence in LLM-based agents.
Generative AI Enhanced Financial Risk Management Information Retrieval
Risk management in finance involves recognizing, evaluating, and addressing financial risks to maintain stability and ensure regulatory compliance. Extracting relevant insights from extensive regulatory documents is a complex challenge requiring advanced retrieval and language models. This paper introduces RiskData, a dataset specifically curated for finetuning embedding models in risk management, and RiskEmbed, a finetuned embedding model designed to improve retrieval accuracy in financial question-answering systems. The dataset is derived from 94 regulatory guidelines published by the Office of the Superintendent of Financial Institutions (OSFI) from 1991 to 2024. We finetune a state-of-the-art sentence BERT embedding model to enhance domain-specific retrieval performance typically for Retrieval-Augmented Generation (RAG) systems. Experimental results demonstrate that RiskEmbed significantly outperforms general-purpose and financial embedding models, achieving substantial improvements in ranking metrics. By open-sourcing both the dataset and the model, we provide a valuable resource for financial institutions and researchers aiming to develop more accurate and efficient risk management AI solutions.
Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models
We examine the potential of ChatGPT and other large language models in predicting stock market returns using news headlines. We use ChatGPT to assess whether each headline is good, bad, or neutral for firms' stock prices. We document a significantly positive correlation between ChatGPT scores and subsequent daily stock returns. We find that ChatGPT outperforms traditional sentiment analysis methods. More basic models such as GPT-1, GPT-2, and BERT cannot accurately forecast returns, indicating return predictability is an emerging capacity of complex language models. Long-short strategies based on ChatGPT-4 deliver the highest Sharpe ratio. Furthermore, we find predictability in both small and large stocks, suggesting market underreaction to company news. Predictability is stronger among smaller stocks and stocks with bad news, consistent with limits-to-arbitrage also playing an important role. Finally, we propose a new method to evaluate and understand the models' reasoning capabilities. Overall, our results suggest that incorporating advanced language models into the investment decision-making process can yield more accurate predictions and enhance the performance of quantitative trading strategies.
R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization
Financial markets pose fundamental challenges for asset return prediction due to their high dimensionality, non-stationarity, and persistent volatility. Despite advances in large language models and multi-agent systems, current quantitative research pipelines suffer from limited automation, weak interpretability, and fragmented coordination across key components such as factor mining and model innovation. In this paper, we propose R&D-Agent for Quantitative Finance, in short RD-Agent(Q), the first data-centric multi-agent framework designed to automate the full-stack research and development of quantitative strategies via coordinated factor-model co-optimization. RD-Agent(Q) decomposes the quant process into two iterative stages: a Research stage that dynamically sets goal-aligned prompts, formulates hypotheses based on domain priors, and maps them to concrete tasks, and a Development stage that employs a code-generation agent, Co-STEER, to implement task-specific code, which is then executed in real-market backtests. The two stages are connected through a feedback stage that thoroughly evaluates experimental outcomes and informs subsequent iterations, with a multi-armed bandit scheduler for adaptive direction selection. Empirically, RD-Agent(Q) achieves up to 2X higher annualized returns than classical factor libraries using 70% fewer factors, and outperforms state-of-the-art deep time-series models on real markets. Its joint factor-model optimization delivers a strong balance between predictive accuracy and strategy robustness. Our code is available at: https://github.com/microsoft/RD-Agent.
FinWorld: An All-in-One Open-Source Platform for End-to-End Financial AI Research and Deployment
Financial AI holds great promise for transforming modern finance, with the potential to support a wide range of tasks such as market forecasting, portfolio management, quantitative trading, and automated analysis. However, existing platforms remain limited in task coverage, lack robust multimodal data integration, and offer insufficient support for the training and deployment of large language models (LLMs). In response to these limitations, we present FinWorld, an all-in-one open-source platform that provides end-to-end support for the entire financial AI workflow, from data acquisition to experimentation and deployment. FinWorld distinguishes itself through native integration of heterogeneous financial data, unified support for diverse AI paradigms, and advanced agent automation, enabling seamless development and deployment. Leveraging data from 2 representative markets, 4 stock pools, and over 800 million financial data points, we conduct comprehensive experiments on 4 key financial AI tasks. These experiments systematically evaluate deep learning and reinforcement learning algorithms, with particular emphasis on RL-based finetuning for LLMs and LLM Agents. The empirical results demonstrate that FinWorld significantly enhances reproducibility, supports transparent benchmarking, and streamlines deployment, thereby providing a strong foundation for future research and real-world applications. Code is available at Github~https://github.com/DVampire/FinWorld.
Quantitative Trading using Deep Q Learning
Reinforcement learning (RL) is a subfield of machine learning that has been used in many fields, such as robotics, gaming, and autonomous systems. There has been growing interest in using RL for quantitative trading, where the goal is to make trades that generate profits in financial markets. This paper presents the use of RL for quantitative trading and reports a case study based on an RL-based trading algorithm. The results show that RL can be a useful tool for quantitative trading and can perform better than traditional trading algorithms. The use of reinforcement learning for quantitative trading is a promising area of research that can help develop more sophisticated and efficient trading systems. Future research can explore the use of other reinforcement learning techniques, the use of other data sources, and the testing of the system on a range of asset classes. Together, our work shows the potential in the use of reinforcement learning for quantitative trading and the need for further research and development in this area. By developing the sophistication and efficiency of trading systems, it may be possible to make financial markets more efficient and generate higher returns for investors.
Show me your NFT and I tell you how it will perform: Multimodal representation learning for NFT selling price prediction
Non-Fungible Tokens (NFTs) represent deeds of ownership, based on blockchain technologies and smart contracts, of unique crypto assets on digital art forms (e.g., artworks or collectibles). In the spotlight after skyrocketing in 2021, NFTs have attracted the attention of crypto enthusiasts and investors intent on placing promising investments in this profitable market. However, the NFT financial performance prediction has not been widely explored to date. In this work, we address the above problem based on the hypothesis that NFT images and their textual descriptions are essential proxies to predict the NFT selling prices. To this purpose, we propose MERLIN, a novel multimodal deep learning framework designed to train Transformer-based language and visual models, along with graph neural network models, on collections of NFTs' images and texts. A key aspect in MERLIN is its independence on financial features, as it exploits only the primary data a user interested in NFT trading would like to deal with, i.e., NFT images and textual descriptions. By learning dense representations of such data, a price-category classification task is performed by MERLIN models, which can also be tuned according to user preferences in the inference phase to mimic different risk-return investment profiles. Experimental evaluation on a publicly available dataset has shown that MERLIN models achieve significant performances according to several financial assessment criteria, fostering profitable investments, and also beating baseline machine-learning classifiers based on financial features.
Multimodal Deep Reinforcement Learning for Portfolio Optimization
We propose a reinforcement learning (RL) framework that leverages multimodal data including historical stock prices, sentiment analysis, and topic embeddings from news articles, to optimize trading strategies for SP100 stocks. Building upon recent advancements in financial reinforcement learning, we aim to enhance the state space representation by integrating financial sentiment data from SEC filings and news headlines and refining the reward function to better align with portfolio performance metrics. Our methodology includes deep reinforcement learning with state tensors comprising price data, sentiment scores, and news embeddings, processed through advanced feature extraction models like CNNs and RNNs. By benchmarking against traditional portfolio optimization techniques and advanced strategies, we demonstrate the efficacy of our approach in delivering superior portfolio performance. Empirical results showcase the potential of our agent to outperform standard benchmarks, especially when utilizing combined data sources under profit-based reward functions.
TradExpert: Revolutionizing Trading with Mixture of Expert LLMs
The integration of Artificial Intelligence (AI) in the financial domain has opened new avenues for quantitative trading, particularly through the use of Large Language Models (LLMs). However, the challenge of effectively synthesizing insights from diverse data sources and integrating both structured and unstructured data persists. This paper presents TradeExpert, a novel framework that employs a mix of experts (MoE) approach, using four specialized LLMs, each analyzing distinct sources of financial data, including news articles, market data, alpha factors, and fundamental data. The insights of these expert LLMs are further synthesized by a General Expert LLM to make a final prediction or decision. With specific prompts, TradeExpert can be switched between the prediction mode and the ranking mode for stock movement prediction and quantitative stock trading, respectively. In addition to existing benchmarks, we also release a large-scale financial dataset to comprehensively evaluate TradeExpert's effectiveness. Our experimental results demonstrate TradeExpert's superior performance across all trading scenarios.
Disentangled Structural and Featural Representation for Task-Agnostic Graph Valuation
With the emergence of data marketplaces, the demand for methods to assess the value of data has increased significantly. While numerous techniques have been proposed for this purpose, none have specifically addressed graphs as the main data modality. Graphs are widely used across various fields, ranging from chemical molecules to social networks. In this study, we break down graphs into two main components: structural and featural, and we focus on evaluating data without relying on specific task-related metrics, making it applicable in practical scenarios where validation requirements may be lacking. We introduce a novel framework called blind message passing, which aligns the seller's and buyer's graphs using a shared node permutation based on graph matching. This allows us to utilize the graph Wasserstein distance to quantify the differences in the structural distribution of graph datasets, called the structural disparities. We then consider featural aspects of buyers' and sellers' graphs for data valuation and capture their statistical similarities and differences, referred to as relevance and diversity, respectively. Our approach ensures that buyers and sellers remain unaware of each other's datasets. Our experiments on real datasets demonstrate the effectiveness of our approach in capturing the relevance, diversity, and structural disparities of seller data for buyers, particularly in graph-based data valuation scenarios.
TraDE: Transformers for Density Estimation
We present TraDE, a self-attention-based architecture for auto-regressive density estimation with continuous and discrete valued data. Our model is trained using a penalized maximum likelihood objective, which ensures that samples from the density estimate resemble the training data distribution. The use of self-attention means that the model need not retain conditional sufficient statistics during the auto-regressive process beyond what is needed for each covariate. On standard tabular and image data benchmarks, TraDE produces significantly better density estimates than existing approaches such as normalizing flow estimators and recurrent auto-regressive models. However log-likelihood on held-out data only partially reflects how useful these estimates are in real-world applications. In order to systematically evaluate density estimators, we present a suite of tasks such as regression using generated samples, out-of-distribution detection, and robustness to noise in the training data and demonstrate that TraDE works well in these scenarios.
ContestTrade: A Multi-Agent Trading System Based on Internal Contest Mechanism
In financial trading, large language model (LLM)-based agents demonstrate significant potential. However, the high sensitivity to market noise undermines the performance of LLM-based trading systems. To address this limitation, we propose a novel multi-agent system featuring an internal competitive mechanism inspired by modern corporate management structures. The system consists of two specialized teams: (1) Data Team - responsible for processing and condensing massive market data into diversified text factors, ensuring they fit the model's constrained context. (2) Research Team - tasked with making parallelized multipath trading decisions based on deep research methods. The core innovation lies in implementing a real-time evaluation and ranking mechanism within each team, driven by authentic market feedback. Each agent's performance undergoes continuous scoring and ranking, with only outputs from top-performing agents being adopted. The design enables the system to adaptively adjust to dynamic environment, enhances robustness against market noise and ultimately delivers superior trading performance. Experimental results demonstrate that our proposed system significantly outperforms prevailing multi-agent systems and traditional quantitative investment methods across diverse evaluation metrics. ContestTrade is open-sourced on GitHub at https://github.com/FinStep-AI/ContestTrade.
Advancing Exchange Rate Forecasting: Leveraging Machine Learning and AI for Enhanced Accuracy in Global Financial Markets
The prediction of foreign exchange rates, such as the US Dollar (USD) to Bangladeshi Taka (BDT), plays a pivotal role in global financial markets, influencing trade, investments, and economic stability. This study leverages historical USD/BDT exchange rate data from 2018 to 2023, sourced from Yahoo Finance, to develop advanced machine learning models for accurate forecasting. A Long Short-Term Memory (LSTM) neural network is employed, achieving an exceptional accuracy of 99.449%, a Root Mean Square Error (RMSE) of 0.9858, and a test loss of 0.8523, significantly outperforming traditional methods like ARIMA (RMSE 1.342). Additionally, a Gradient Boosting Classifier (GBC) is applied for directional prediction, with backtesting on a 10,000 initial capital revealing a 40.82% profitable trade rate, though resulting in a net loss of 20,653.25 over 49 trades. The study analyzes historical trends, showing a decline in BDT/USD rates from 0.012 to 0.009, and incorporates normalized daily returns to capture volatility. These findings highlight the potential of deep learning in forex forecasting, offering traders and policymakers robust tools to mitigate risks. Future work could integrate sentiment analysis and real-time economic indicators to further enhance model adaptability in volatile markets.
Will LLMs be Professional at Fund Investment? DeepFund: A Live Arena Perspective
Large Language Models (LLMs) have demonstrated impressive capabilities across various domains, but their effectiveness in financial decision-making remains inadequately evaluated. Current benchmarks primarily assess LLMs' understanding on financial documents rather than the ability to manage assets or dig out trading opportunities in dynamic market conditions. Despite the release of new benchmarks for evaluating diversified tasks on the financial domain, we identified four major problems in these benchmarks, which are data leakage, navel-gazing, over-intervention, and maintenance-hard. To pave the research gap, we introduce DeepFund, a comprehensive arena platform for evaluating LLM-based trading strategies in a live environment. Our approach implements a multi-agent framework where they serve as multiple key roles that realize the real-world investment decision processes. Moreover, we provide a web interface that visualizes LLMs' performance with fund investment metrics across different market conditions, enabling detailed comparative analysis. Through DeepFund, we aim to provide a more realistic and fair assessment on LLM's capabilities in fund investment, offering diversified insights and revealing their potential applications in real-world financial markets. Our code is publicly available at https://github.com/HKUSTDial/DeepFund.
Limit Order Book Dynamics in Matching Markets:Microstructure, Spread, and Execution Slippage
Conventional models of matching markets assume that monetary transfers can clear markets by compensating for utility differentials. However, empirical patterns show that such transfers often fail to close structural preference gaps. This paper introduces a market microstructure framework that models matching decisions as a limit order book system with rigid bid ask spreads. Individual preferences are represented by a latent preference state matrix, where the spread between an agent's internal ask price (the unconditional maximum) and the market's best bid (the reachable maximum) creates a structural liquidity constraint. We establish a Threshold Impossibility Theorem showing that linear compensation cannot close these spreads unless it induces a categorical identity shift. A dynamic discrete choice execution model further demonstrates that matches occur only when the market to book ratio crosses a time decaying liquidity threshold, analogous to order execution under inventory pressure. Numerical experiments validate persistent slippage, regional invariance of preference orderings, and high tier zero spread executions. The model provides a unified microstructure explanation for matching failures, compensation inefficiency, and post match regret in illiquid order driven environments.
Can LLM-based Financial Investing Strategies Outperform the Market in Long Run?
Large Language Models (LLMs) have recently been leveraged for asset pricing tasks and stock trading applications, enabling AI agents to generate investment decisions from unstructured financial data. However, most evaluations of LLM timing-based investing strategies are conducted on narrow timeframes and limited stock universes, overstating effectiveness due to survivorship and data-snooping biases. We critically assess their generalizability and robustness by proposing FINSABER, a backtesting framework evaluating timing-based strategies across longer periods and a larger universe of symbols. Systematic backtests over two decades and 100+ symbols reveal that previously reported LLM advantages deteriorate significantly under broader cross-section and over a longer-term evaluation. Our market regime analysis further demonstrates that LLM strategies are overly conservative in bull markets, underperforming passive benchmarks, and overly aggressive in bear markets, incurring heavy losses. These findings highlight the need to develop LLM strategies that are able to prioritise trend detection and regime-aware risk controls over mere scaling of framework complexity.
Modeling financial analysts' decision making via the pragmatics and semantics of earnings calls
Every fiscal quarter, companies hold earnings calls in which company executives respond to questions from analysts. After these calls, analysts often change their price target recommendations, which are used in equity research reports to help investors make decisions. In this paper, we examine analysts' decision making behavior as it pertains to the language content of earnings calls. We identify a set of 20 pragmatic features of analysts' questions which we correlate with analysts' pre-call investor recommendations. We also analyze the degree to which semantic and pragmatic features from an earnings call complement market data in predicting analysts' post-call changes in price targets. Our results show that earnings calls are moderately predictive of analysts' decisions even though these decisions are influenced by a number of other factors including private communication with company executives and market conditions. A breakdown of model errors indicates disparate performance on calls from different market sectors.
AI-Powered Energy Algorithmic Trading: Integrating Hidden Markov Models with Neural Networks
In quantitative finance, machine learning methods are essential for alpha generation. This study introduces a new approach that combines Hidden Markov Models (HMM) and neural networks, integrated with Black-Litterman portfolio optimization. During the COVID period (2019-2022), this dual-model approach achieved a 83% return with a Sharpe ratio of 0.77. It incorporates two risk models to enhance risk management, showing efficiency during volatile periods. The methodology was implemented on the QuantConnect platform, which was chosen for its robust framework and experimental reproducibility. The system, which predicts future price movements, includes a three-year warm-up to ensure proper algorithm function. It targets highly liquid, large-cap energy stocks to ensure stable and predictable performance while also considering broker payments. The dual-model alpha system utilizes log returns to select the optimal state based on the historical performance. It combines state predictions with neural network outputs, which are based on historical data, to generate trading signals. This study examined the architecture of the trading system, data pre-processing, training, and performance. The full code and backtesting data are available under the QuantConnect terms.
An Effective Data Creation Pipeline to Generate High-quality Financial Instruction Data for Large Language Model
At the beginning era of large language model, it is quite critical to generate a high-quality financial dataset to fine-tune a large language model for financial related tasks. Thus, this paper presents a carefully designed data creation pipeline for this purpose. Particularly, we initiate a dialogue between an AI investor and financial expert using ChatGPT and incorporate the feedback of human financial experts, leading to the refinement of the dataset. This pipeline yielded a robust instruction tuning dataset comprised of 103k multi-turn chats. Extensive experiments have been conducted on this dataset to evaluate the model's performance by adopting an external GPT-4 as the judge. The promising experimental results verify that our approach led to significant advancements in generating accurate, relevant, and financial-style responses from AI models, and thus providing a powerful tool for applications within the financial sector.
Revisiting Ensemble Methods for Stock Trading and Crypto Trading Tasks at ACM ICAIF FinRL Contest 2023-2024
Reinforcement learning has demonstrated great potential for performing financial tasks. However, it faces two major challenges: policy instability and sampling bottlenecks. In this paper, we revisit ensemble methods with massively parallel simulations on graphics processing units (GPUs), significantly enhancing the computational efficiency and robustness of trained models in volatile financial markets. Our approach leverages the parallel processing capability of GPUs to significantly improve the sampling speed for training ensemble models. The ensemble models combine the strengths of component agents to improve the robustness of financial decision-making strategies. We conduct experiments in both stock and cryptocurrency trading tasks to evaluate the effectiveness of our approach. Massively parallel simulation on a single GPU improves the sampling speed by up to 1,746times using 2,048 parallel environments compared to a single environment. The ensemble models have high cumulative returns and outperform some individual agents, reducing maximum drawdown by up to 4.17% and improving the Sharpe ratio by up to 0.21. This paper describes trading tasks at ACM ICAIF FinRL Contests in 2023 and 2024.
FinRpt: Dataset, Evaluation System and LLM-based Multi-agent Framework for Equity Research Report Generation
While LLMs have shown great success in financial tasks like stock prediction and question answering, their application in fully automating Equity Research Report generation remains uncharted territory. In this paper, we formulate the Equity Research Report (ERR) Generation task for the first time. To address the data scarcity and the evaluation metrics absence, we present an open-source evaluation benchmark for ERR generation - FinRpt. We frame a Dataset Construction Pipeline that integrates 7 financial data types and produces a high-quality ERR dataset automatically, which could be used for model training and evaluation. We also introduce a comprehensive evaluation system including 11 metrics to assess the generated ERRs. Moreover, we propose a multi-agent framework specifically tailored to address this task, named FinRpt-Gen, and train several LLM-based agents on the proposed datasets using Supervised Fine-Tuning and Reinforcement Learning. Experimental results indicate the data quality and metrics effectiveness of the benchmark FinRpt and the strong performance of FinRpt-Gen, showcasing their potential to drive innovation in the ERR generation field. All code and datasets are publicly available.
Stock Volatility Prediction Based on Transformer Model Using Mixed-Frequency Data
With the increasing volume of high-frequency data in the information age, both challenges and opportunities arise in the prediction of stock volatility. On one hand, the outcome of prediction using tradition method combining stock technical and macroeconomic indicators still leaves room for improvement; on the other hand, macroeconomic indicators and peoples' search record on those search engines affecting their interested topics will intuitively have an impact on the stock volatility. For the convenience of assessment of the influence of these indicators, macroeconomic indicators and stock technical indicators are then grouped into objective factors, while Baidu search indices implying people's interested topics are defined as subjective factors. To align different frequency data, we introduce GARCH-MIDAS model. After mixing all the above data, we then feed them into Transformer model as part of the training data. Our experiments show that this model outperforms the baselines in terms of mean square error. The adaption of both types of data under Transformer model significantly reduces the mean square error from 1.00 to 0.86.
Combining Deep Learning and GARCH Models for Financial Volatility and Risk Forecasting
In this paper, we develop a hybrid approach to forecasting the volatility and risk of financial instruments by combining common econometric GARCH time series models with deep learning neural networks. For the latter, we employ Gated Recurrent Unit (GRU) networks, whereas four different specifications are used as the GARCH component: standard GARCH, EGARCH, GJR-GARCH and APARCH. Models are tested using daily logarithmic returns on the S&P 500 index as well as gold price Bitcoin prices, with the three assets representing quite distinct volatility dynamics. As the main volatility estimator, also underlying the target function of our hybrid models, we use the price-range-based Garman-Klass estimator, modified to incorporate the opening and closing prices. Volatility forecasts resulting from the hybrid models are employed to evaluate the assets' risk using the Value-at-Risk (VaR) and Expected Shortfall (ES) at two different tolerance levels of 5% and 1%. Gains from combining the GARCH and GRU approaches are discussed in the contexts of both the volatility and risk forecasts. In general, it can be concluded that the hybrid solutions produce more accurate point volatility forecasts, although it does not necessarily translate into superior VaR and ES forecasts.
Quantformer: from attention to profit with a quantitative transformer trading strategy
In traditional quantitative trading practice, navigating the complicated and dynamic financial market presents a persistent challenge. Fully capturing various market variables, including long-term information, as well as essential signals that may lead to profit remains a difficult task for learning algorithms. In order to tackle this challenge, this paper introduces quantformer, an enhanced neural network architecture based on transformers, to build investment factors. By transfer learning from sentiment analysis, quantformer not only exploits its original inherent advantages in capturing long-range dependencies and modeling complex data relationships, but is also able to solve tasks with numerical inputs and accurately forecast future returns over a given period. This work collects more than 5,000,000 rolling data of 4,601 stocks in the Chinese capital market from 2010 to 2019. The results of this study demonstrated the model's superior performance in predicting stock trends compared with other 100 factor-based quantitative strategies. Notably, the model's innovative use of transformer-liked model to establish factors, in conjunction with market sentiment information, has been shown to enhance the accuracy of trading signals significantly, thereby offering promising implications for the future of quantitative trading strategies.
Should ChatGPT and Bard Share Revenue with Their Data Providers? A New Business Model for the AI Era
With various AI tools such as ChatGPT becoming increasingly popular, we are entering a true AI era. We can foresee that exceptional AI tools will soon reap considerable profits. A crucial question arise: should AI tools share revenue with their training data providers in additional to traditional stakeholders and shareholders? The answer is Yes. Large AI tools, such as large language models, always require more and better quality data to continuously improve, but current copyright laws limit their access to various types of data. Sharing revenue between AI tools and their data providers could transform the current hostile zero-sum game relationship between AI tools and a majority of copyrighted data owners into a collaborative and mutually beneficial one, which is necessary to facilitate the development of a virtuous cycle among AI tools, their users and data providers that drives forward AI technology and builds a healthy AI ecosystem. However, current revenue-sharing business models do not work for AI tools in the forthcoming AI era, since the most widely used metrics for website-based traffic and action, such as clicks, will be replaced by new metrics such as prompts and cost per prompt for generative AI tools. A completely new revenue-sharing business model, which must be almost independent of AI tools and be easily explained to data providers, needs to establish a prompt-based scoring system to measure data engagement of each data provider. This paper systematically discusses how to build such a scoring system for all data providers for AI tools based on classification and content similarity models, and outlines the requirements for AI tools or third parties to build it. Sharing revenue with data providers using such a scoring system would encourage more data owners to participate in the revenue-sharing program. This will be a utilitarian AI era where all parties benefit.
FinBloom: Knowledge Grounding Large Language Model with Real-time Financial Data
Large language models (LLMs) excel at generating human-like responses but often struggle with interactive tasks that require access to real-time information. This limitation poses challenges in finance, where models must access up-to-date information, such as recent news or price movements, to support decision-making. To address this, we introduce Financial Agent, a knowledge-grounding approach for LLMs to handle financial queries using real-time text and tabular data. Our contributions are threefold: First, we develop a Financial Context Dataset of over 50,000 financial queries paired with the required context. Second, we train FinBloom 7B, a custom 7 billion parameter LLM, on 14 million financial news articles from Reuters and Deutsche Presse-Agentur, alongside 12 million Securities and Exchange Commission (SEC) filings. Third, we fine-tune FinBloom 7B using the Financial Context Dataset to serve as a Financial Agent. This agent generates relevant financial context, enabling efficient real-time data retrieval to answer user queries. By reducing latency and eliminating the need for users to manually provide accurate data, our approach significantly enhances the capability of LLMs to handle dynamic financial tasks. Our proposed approach makes real-time financial decisions, algorithmic trading and other related tasks streamlined, and is valuable in contexts with high-velocity data flows.
Making Markets for Information Security: The Role of Online Platforms in Bug Bounty Programs
Security is an essential cornerstone of functioning digital marketplaces and communities. If users doubt that data shared online will remain secure, they will withdraw from platforms. Even when firms take these risks seriously, security expertise is expensive and vulnerabilities are diverse in nature. Increasingly, firms and governments are turning to bug bounty programs (BBPs) to crowdsource their cybersecurity, in which they pay individuals for reporting vulnerabilities in their systems. And while the use of BBPs has grown significantly in recent years, research on the actors in this market and their incentives remains limited. Using the lens of transaction cost economics, this paper examines the incentives of firms and researchers (sometimes called hackers) participating in BBPs. We study the crucial role that centralized platforms that organize BBPs play in this emerging market. We carry out an analysis of the HackerOne BBP platform, using a novel dataset on over 14,000 researchers reporting over 125,000 public vulnerabilities to over 500 firms from 2014 to the end of 2021. We outline how platforms like HackerOne make a market for information security vulnerabilities by reducing information asymmetries and their associated transaction costs.
Beyond the Mean: Limit Theory and Tests for Infinite-Mean Autoregressive Conditional Durations
Integrated autoregressive conditional duration (ACD) models serve as natural counterparts to the well-known integrated GARCH models used for financial returns. However, despite their resemblance, asymptotic theory for ACD is challenging and also not complete, in particular for integrated ACD. Central challenges arise from the facts that (i) integrated ACD processes imply durations with infinite expectation, and (ii) even in the non-integrated case, conventional asymptotic approaches break down due to the randomness in the number of durations within a fixed observation period. Addressing these challenges, we provide here unified asymptotic theory for the (quasi-) maximum likelihood estimator for ACD models; a unified theory which includes integrated ACD models. Based on the new results, we also provide a novel framework for hypothesis testing in duration models, enabling inference on a key empirical question: whether durations possess a finite or infinite expectation. We apply our results to high-frequency cryptocurrency ETF trading data. Motivated by parameter estimates near the integrated ACD boundary, we assess whether durations between trades in these markets have finite expectation, an assumption often made implicitly in the literature on point process models. Our empirical findings indicate infinite-mean durations for all the five cryptocurrencies examined, with the integrated ACD hypothesis rejected -- against alternatives with tail index less than one -- for four out of the five cryptocurrencies considered.
A Portfolio Rebalancing Approach for the Indian Stock Market
This chapter presents a calendar rebalancing approach to portfolios of stocks in the Indian stock market. Ten important sectors of the Indian economy are first selected. For each of these sectors, the top ten stocks are identified based on their free-float market capitalization values. Using the ten stocks in each sector, a sector-specific portfolio is designed. In this study, the historical stock prices are used from January 4, 2021, to September 20, 2023 (NSE Website). The portfolios are designed based on the training data from January 4, 2021 to June 30, 2022. The performances of the portfolios are tested over the period from July 1, 2022, to September 20, 2023. The calendar rebalancing approach presented in the chapter is based on a yearly rebalancing method. However, the method presented is perfectly flexible and can be adapted for weekly or monthly rebalancing. The rebalanced portfolios for the ten sectors are analyzed in detail for their performances. The performance results are not only indicative of the relative performances of the sectors over the training (i.e., in-sample) data and test (out-of-sample) data, but they also reflect the overall effectiveness of the proposed portfolio rebalancing approach.
QuantAgent: Price-Driven Multi-Agent LLMs for High-Frequency Trading
Recent advances in Large Language Models (LLMs) have demonstrated impressive capabilities in financial reasoning and market understanding. Multi-agent LLM frameworks such as TradingAgent and FINMEM augment these models to long-horizon investment tasks, leveraging fundamental and sentiment-based inputs for strategic decision-making. However, such systems are ill-suited for the high-speed, precision-critical demands of High-Frequency Trading (HFT). HFT requires rapid, risk-aware decisions based on structured, short-horizon signals, including technical indicators, chart patterns, and trend-based features, distinct from the long-term semantic reasoning typical of traditional financial LLM applications. To this end, we introduce QuantAgent, the first multi-agent LLM framework explicitly designed for high-frequency algorithmic trading. The system decomposes trading into four specialized agents, Indicator, Pattern, Trend, and Risk, each equipped with domain-specific tools and structured reasoning capabilities to capture distinct aspects of market dynamics over short temporal windows. In zero-shot evaluations across ten financial instruments, including Bitcoin and Nasdaq futures, QuantAgent demonstrates superior performance in both predictive accuracy and cumulative return over 4-hour trading intervals, outperforming strong neural and rule-based baselines. Our findings suggest that combining structured financial priors with language-native reasoning unlocks new potential for traceable, real-time decision systems in high-frequency financial markets.
SigFormer: Signature Transformers for Deep Hedging
Deep hedging is a promising direction in quantitative finance, incorporating models and techniques from deep learning research. While giving excellent hedging strategies, models inherently requires careful treatment in designing architectures for neural networks. To mitigate such difficulties, we introduce SigFormer, a novel deep learning model that combines the power of path signatures and transformers to handle sequential data, particularly in cases with irregularities. Path signatures effectively capture complex data patterns, while transformers provide superior sequential attention. Our proposed model is empirically compared to existing methods on synthetic data, showcasing faster learning and enhanced robustness, especially in the presence of irregular underlying price data. Additionally, we validate our model performance through a real-world backtest on hedging the SP 500 index, demonstrating positive outcomes.
A new paradigm for accelerating clinical data science at Stanford Medicine
Stanford Medicine is building a new data platform for our academic research community to do better clinical data science. Hospitals have a large amount of patient data and researchers have demonstrated the ability to reuse that data and AI approaches to derive novel insights, support patient care, and improve care quality. However, the traditional data warehouse and Honest Broker approaches that are in current use, are not scalable. We are establishing a new secure Big Data platform that aims to reduce time to access and analyze data. In this platform, data is anonymized to preserve patient data privacy and made available preparatory to Institutional Review Board (IRB) submission. Furthermore, the data is standardized such that analysis done at Stanford can be replicated elsewhere using the same analytical code and clinical concepts. Finally, the analytics data warehouse integrates with a secure data science computational facility to support large scale data analytics. The ecosystem is designed to bring the modern data science community to highly sensitive clinical data in a secure and collaborative big data analytics environment with a goal to enable bigger, better and faster science.
Predictive Crypto-Asset Automated Market Making Architecture for Decentralized Finance using Deep Reinforcement Learning
The study proposes a quote-driven predictive automated market maker (AMM) platform with on-chain custody and settlement functions, alongside off-chain predictive reinforcement learning capabilities to improve liquidity provision of real-world AMMs. The proposed AMM architecture is an augmentation to the Uniswap V3, a cryptocurrency AMM protocol, by utilizing a novel market equilibrium pricing for reduced divergence and slippage loss. Further, the proposed architecture involves a predictive AMM capability, utilizing a deep hybrid Long Short-Term Memory (LSTM) and Q-learning reinforcement learning framework that looks to improve market efficiency through better forecasts of liquidity concentration ranges, so liquidity starts moving to expected concentration ranges, prior to asset price movement, so that liquidity utilization is improved. The augmented protocol framework is expected have practical real-world implications, by (i) reducing divergence loss for liquidity providers, (ii) reducing slippage for crypto-asset traders, while (iii) improving capital efficiency for liquidity provision for the AMM protocol. To our best knowledge, there are no known protocol or literature that are proposing similar deep learning-augmented AMM that achieves similar capital efficiency and loss minimization objectives for practical real-world applications.
Intriguing Properties of Data Attribution on Diffusion Models
Data attribution seeks to trace model outputs back to training data. With the recent development of diffusion models, data attribution has become a desired module to properly assign valuations for high-quality or copyrighted training samples, ensuring that data contributors are fairly compensated or credited. Several theoretically motivated methods have been proposed to implement data attribution, in an effort to improve the trade-off between computational scalability and effectiveness. In this work, we conduct extensive experiments and ablation studies on attributing diffusion models, specifically focusing on DDPMs trained on CIFAR-10 and CelebA, as well as a Stable Diffusion model LoRA-finetuned on ArtBench. Intriguingly, we report counter-intuitive observations that theoretically unjustified design choices for attribution empirically outperform previous baselines by a large margin, in terms of both linear datamodeling score and counterfactual evaluation. Our work presents a significantly more efficient approach for attributing diffusion models, while the unexpected findings suggest that at least in non-convex settings, constructions guided by theoretical assumptions may lead to inferior attribution performance. The code is available at https://github.com/sail-sg/D-TRAK.
FinMultiTime: A Four-Modal Bilingual Dataset for Financial Time-Series Analysis
Pure time series forecasting tasks typically focus exclusively on numerical features; however, real-world financial decision-making demands the comparison and analysis of heterogeneous sources of information. Recent advances in deep learning and large scale language models (LLMs) have made significant strides in capturing sentiment and other qualitative signals, thereby enhancing the accuracy of financial time series predictions. Despite these advances, most existing datasets consist solely of price series and news text, are confined to a single market, and remain limited in scale. In this paper, we introduce FinMultiTime, the first large scale, multimodal financial time series dataset. FinMultiTime temporally aligns four distinct modalities financial news, structured financial tables, K-line technical charts, and stock price time series across both the S&P 500 and HS 300 universes. Covering 5,105 stocks from 2009 to 2025 in the United States and China, the dataset totals 112.6 GB and provides minute-level, daily, and quarterly resolutions, thus capturing short, medium, and long term market signals with high fidelity. Our experiments demonstrate that (1) scale and data quality markedly boost prediction accuracy; (2) multimodal fusion yields moderate gains in Transformer models; and (3) a fully reproducible pipeline enables seamless dataset updates.
Sentiment-Aware Mean-Variance Portfolio Optimization for Cryptocurrencies
This paper presents a dynamic cryptocurrency portfolio optimization strategy that integrates technical indicators and sentiment analysis to enhance investment decision-making. The proposed method employs the 14-day Relative Strength Index (RSI) and 14-day Simple Moving Average (SMA) to capture market momentum, while sentiment scores are extracted from news articles using the VADER (Valence Aware Dictionary and sEntiment Reasoner) model, with compound scores quantifying overall market tone. The large language model Google Gemini is used to further verify the sentiment scores predicted by VADER and give investment decisions. These technical indicator and sentiment signals are incorporated into the expected return estimates before applying mean-variance optimization with constraints on asset weights. The strategy is evaluated through a rolling-window backtest over cryptocurrency market data, with Bitcoin (BTC) and an equal-weighted portfolio of selected cryptocurrencies serving as benchmarks. Experimental results show that the proposed approach achieves a cumulative return of 38.72, substantially exceeding Bitcoin's 8.85 and the equal-weighted portfolio's 21.65 over the same period, and delivers a higher Sharpe ratio (1.1093 vs. 0.8853 and 1.0194, respectively). However, the strategy exhibits a larger maximum drawdown (-18.52%) compared to Bitcoin (-4.48%) and the equal-weighted portfolio (-11.02%), indicating higher short-term downside risk. These results highlight the potential of combining sentiment and technical signals to improve cryptocurrency portfolio performance, while also emphasizing the need to address risk exposure in volatile markets.
Customs Import Declaration Datasets
Given the huge volume of cross-border flows, effective and efficient control of trade becomes more crucial in protecting people and society from illicit trade. However, limited accessibility of the transaction-level trade datasets hinders the progress of open research, and lots of customs administrations have not benefited from the recent progress in data-based risk management. In this paper, we introduce an import declaration dataset to facilitate the collaboration between domain experts in customs administrations and researchers from diverse domains, such as data science and machine learning. The dataset contains 54,000 artificially generated trades with 22 key attributes, and it is synthesized with conditional tabular GAN while maintaining correlated features. Synthetic data has several advantages. First, releasing the dataset is free from restrictions that do not allow disclosing the original import data. The fabrication step minimizes the possible identity risk which may exist in trade statistics. Second, the published data follow a similar distribution to the source data so that it can be used in various downstream tasks. Hence, our dataset can be used as a benchmark for testing the performance of any classification algorithm. With the provision of data and its generation process, we open baseline codes for fraud detection tasks, as we empirically show that more advanced algorithms can better detect fraud.
Retrieval-augmented Large Language Models for Financial Time Series Forecasting
Stock movement prediction, a fundamental task in financial time-series forecasting, requires identifying and retrieving critical influencing factors from vast amounts of time-series data. However, existing text-trained or numeric similarity-based retrieval methods fall short in handling complex financial analysis. To address this, we propose the first retrieval-augmented generation (RAG) framework for financial time-series forecasting, featuring three key innovations: a fine-tuned 1B parameter large language model (StockLLM) as the backbone, a novel candidate selection method leveraging LLM feedback, and a training objective that maximizes similarity between queries and historically significant sequences. This enables our retriever, FinSeer, to uncover meaningful patterns while minimizing noise in complex financial data. We also construct new datasets integrating financial indicators and historical stock prices to train FinSeer and ensure robust evaluation. Experimental results demonstrate that our RAG framework outperforms bare StockLLM and random retrieval, highlighting its effectiveness, while FinSeer surpasses existing retrieval methods, achieving an 8\% higher accuracy on BIGDATA22 and retrieving more impactful sequences. This work underscores the importance of tailored retrieval models in financial forecasting and provides a novel framework for future research.
Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning
Portfolio Selection is an important real-world financial task and has attracted extensive attention in artificial intelligence communities. This task, however, has two main difficulties: (i) the non-stationary price series and complex asset correlations make the learning of feature representation very hard; (ii) the practicality principle in financial markets requires controlling both transaction and risk costs. Most existing methods adopt handcraft features and/or consider no constraints for the costs, which may make them perform unsatisfactorily and fail to control both costs in practice. In this paper, we propose a cost-sensitive portfolio selection method with deep reinforcement learning. Specifically, a novel two-stream portfolio policy network is devised to extract both price series patterns and asset correlations, while a new cost-sensitive reward function is developed to maximize the accumulated return and constrain both costs via reinforcement learning. We theoretically analyze the near-optimality of the proposed reward, which shows that the growth rate of the policy regarding this reward function can approach the theoretical optimum. We also empirically evaluate the proposed method on real-world datasets. Promising results demonstrate the effectiveness and superiority of the proposed method in terms of profitability, cost-sensitivity and representation abilities.
QTMRL: An Agent for Quantitative Trading Decision-Making Based on Multi-Indicator Guided Reinforcement Learning
In the highly volatile and uncertain global financial markets, traditional quantitative trading models relying on statistical modeling or empirical rules often fail to adapt to dynamic market changes and black swan events due to rigid assumptions and limited generalization. To address these issues, this paper proposes QTMRL (Quantitative Trading Multi-Indicator Reinforcement Learning), an intelligent trading agent combining multi-dimensional technical indicators with reinforcement learning (RL) for adaptive and stable portfolio management. We first construct a comprehensive multi-indicator dataset using 23 years of S&P 500 daily OHLCV data (2000-2022) for 16 representative stocks across 5 sectors, enriching raw data with trend, volatility, and momentum indicators to capture holistic market dynamics. Then we design a lightweight RL framework based on the Advantage Actor-Critic (A2C) algorithm, including data processing, A2C algorithm, and trading agent modules to support policy learning and actionable trading decisions. Extensive experiments compare QTMRL with 9 baselines (e.g., ARIMA, LSTM, moving average strategies) across diverse market regimes, verifying its superiority in profitability, risk adjustment, and downside risk control. The code of QTMRL is publicly available at https://github.com/ChenJiahaoJNU/QTMRL.git
Forecasting S&P 500 Using LSTM Models
With the volatile and complex nature of financial data influenced by external factors, forecasting the stock market is challenging. Traditional models such as ARIMA and GARCH perform well with linear data but struggle with non-linear dependencies. Machine learning and deep learning models, particularly Long Short-Term Memory (LSTM) networks, address these challenges by capturing intricate patterns and long-term dependencies. This report compares ARIMA and LSTM models in predicting the S&P 500 index, a major financial benchmark. Using historical price data and technical indicators, we evaluated these models using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The ARIMA model showed reasonable performance with an MAE of 462.1, RMSE of 614, and 89.8 percent accuracy, effectively capturing short-term trends but limited by its linear assumptions. The LSTM model, leveraging sequential processing capabilities, outperformed ARIMA with an MAE of 369.32, RMSE of 412.84, and 92.46 percent accuracy, capturing both short- and long-term dependencies. Notably, the LSTM model without additional features performed best, achieving an MAE of 175.9, RMSE of 207.34, and 96.41 percent accuracy, showcasing its ability to handle market data efficiently. Accurately predicting stock movements is crucial for investment strategies, risk assessments, and market stability. Our findings confirm the potential of deep learning models in handling volatile financial data compared to traditional ones. The results highlight the effectiveness of LSTM and suggest avenues for further improvements. This study provides insights into financial forecasting, offering a comparative analysis of ARIMA and LSTM while outlining their strengths and limitations.
FinRobot: AI Agent for Equity Research and Valuation with Large Language Models
As financial markets grow increasingly complex, there is a rising need for automated tools that can effectively assist human analysts in equity research, particularly within sell-side research. While Generative AI (GenAI) has attracted significant attention in this field, existing AI solutions often fall short due to their narrow focus on technical factors and limited capacity for discretionary judgment. These limitations hinder their ability to adapt to new data in real-time and accurately assess risks, which diminishes their practical value for investors. This paper presents FinRobot, the first AI agent framework specifically designed for equity research. FinRobot employs a multi-agent Chain of Thought (CoT) system, integrating both quantitative and qualitative analyses to emulate the comprehensive reasoning of a human analyst. The system is structured around three specialized agents: the Data-CoT Agent, which aggregates diverse data sources for robust financial integration; the Concept-CoT Agent, which mimics an analysts reasoning to generate actionable insights; and the Thesis-CoT Agent, which synthesizes these insights into a coherent investment thesis and report. FinRobot provides thorough company analysis supported by precise numerical data, industry-appropriate valuation metrics, and realistic risk assessments. Its dynamically updatable data pipeline ensures that research remains timely and relevant, adapting seamlessly to new financial information. Unlike existing automated research tools, such as CapitalCube and Wright Reports, FinRobot delivers insights comparable to those produced by major brokerage firms and fundamental research vendors. We open-source FinRobot at https://github. com/AI4Finance-Foundation/FinRobot.
FinSearchComp: Towards a Realistic, Expert-Level Evaluation of Financial Search and Reasoning
Search has emerged as core infrastructure for LLM-based agents and is widely viewed as critical on the path toward more general intelligence. Finance is a particularly demanding proving ground: analysts routinely conduct complex, multi-step searches over time-sensitive, domain-specific data, making it ideal for assessing both search proficiency and knowledge-grounded reasoning. Yet no existing open financial datasets evaluate data searching capability of end-to-end agents, largely because constructing realistic, complicated tasks requires deep financial expertise and time-sensitive data is hard to evaluate. We present FinSearchComp, the first fully open-source agent benchmark for realistic, open-domain financial search and reasoning. FinSearchComp comprises three tasks -- Time-Sensitive Data Fetching, Simple Historical Lookup, and Complex Historical Investigation -- closely reproduce real-world financial analyst workflows. To ensure difficulty and reliability, we engage 70 professional financial experts for annotation and implement a rigorous multi-stage quality-assurance pipeline. The benchmark includes 635 questions spanning global and Greater China markets, and we evaluate 21 models (products) on it. Grok 4 (web) tops the global subset, approaching expert-level accuracy. DouBao (web) leads on the Greater China subset. Experimental analyses show that equipping agents with web search and financial plugins substantially improves results on FinSearchComp, and the country origin of models and tools impact performance significantly.By aligning with realistic analyst tasks and providing end-to-end evaluation, FinSearchComp offers a professional, high-difficulty testbed for complex financial search and reasoning.
Harmful Terms and Where to Find Them: Measuring and Modeling Unfavorable Financial Terms and Conditions in Shopping Websites at Scale
Terms and conditions for online shopping websites often contain terms that can have significant financial consequences for customers. Despite their impact, there is currently no comprehensive understanding of the types and potential risks associated with unfavorable financial terms. Furthermore, there are no publicly available detection systems or datasets to systematically identify or mitigate these terms. In this paper, we take the first steps toward solving this problem with three key contributions. First, we introduce TermMiner, an automated data collection and topic modeling pipeline to understand the landscape of unfavorable financial terms. Second, we create ShopTC-100K, a dataset of terms and conditions from shopping websites in the Tranco top 100K list, comprising 1.8 million terms from 8,251 websites. Consequently, we develop a taxonomy of 22 types from 4 categories of unfavorable financial terms -- spanning purchase, post-purchase, account termination, and legal aspects. Third, we build TermLens, an automated detector that uses Large Language Models (LLMs) to identify unfavorable financial terms. Fine-tuned on an annotated dataset, TermLens achieves an F1 score of 94.6\% and a false positive rate of 2.3\% using GPT-4o. When applied to shopping websites from the Tranco top 100K, we find that 42.06\% of these sites contain at least one unfavorable financial term, with such terms being more prevalent on less popular websites. Case studies further highlight the financial risks and customer dissatisfaction associated with unfavorable financial terms, as well as the limitations of existing ecosystem defenses.
Short-term Volatility Estimation for High Frequency Trades using Gaussian processes (GPs)
The fundamental theorem behind financial markets is that stock prices are intrinsically complex and stochastic. One of the complexities is the volatility associated with stock prices. Volatility is a tendency for prices to change unexpectedly [1]. Price volatility is often detrimental to the return economics, and thus, investors should factor it in whenever making investment decisions, choices, and temporal or permanent moves. It is, therefore, crucial to make necessary and regular short and long-term stock price volatility forecasts for the safety and economics of investors returns. These forecasts should be accurate and not misleading. Different models and methods, such as ARCH GARCH models, have been intuitively implemented to make such forecasts. However, such traditional means fail to capture the short-term volatility forecasts effectively. This paper, therefore, investigates and implements a combination of numeric and probabilistic models for short-term volatility and return forecasting for high-frequency trades. The essence is that one-day-ahead volatility forecasts were made with Gaussian Processes (GPs) applied to the outputs of a Numerical market prediction (NMP) model. Firstly, the stock price data from NMP was corrected by a GP. Since it is not easy to set price limits in a market due to its free nature and randomness, a Censored GP was used to model the relationship between the corrected stock prices and returns. Forecasting errors were evaluated using the implied and estimated data.
Reinforcement Learning and Deep Stochastic Optimal Control for Final Quadratic Hedging
We consider two data driven approaches, Reinforcement Learning (RL) and Deep Trajectory-based Stochastic Optimal Control (DTSOC) for hedging a European call option without and with transaction cost according to a quadratic hedging P&L objective at maturity ("variance-optimal hedging" or "final quadratic hedging"). We study the performance of the two approaches under various market environments (modeled via the Black-Scholes and/or the log-normal SABR model) to understand their advantages and limitations. Without transaction costs and in the Black-Scholes model, both approaches match the performance of the variance-optimal Delta hedge. In the log-normal SABR model without transaction costs, they match the performance of the variance-optimal Barlett's Delta hedge. Agents trained on Black-Scholes trajectories with matching initial volatility but used on SABR trajectories match the performance of Bartlett's Delta hedge in average cost, but show substantially wider variance. To apply RL approaches to these problems, P&L at maturity is written as sum of step-wise contributions and variants of RL algorithms are implemented and used that minimize expectation of second moments of such sums.
Stockformer: A Price-Volume Factor Stock Selection Model Based on Wavelet Transform and Multi-Task Self-Attention Networks
As the Chinese stock market continues to evolve and its market structure grows increasingly complex, traditional quantitative trading methods are facing escalating challenges. Particularly, due to policy uncertainty and the frequent market fluctuations triggered by sudden economic events, existing models often struggle to accurately predict market dynamics. To address these challenges, this paper introduces Stockformer, a price-volume factor stock selection model that integrates wavelet transformation and a multitask self-attention network, aimed at enhancing responsiveness and predictive accuracy regarding market instabilities. Through discrete wavelet transform, Stockformer decomposes stock returns into high and low frequencies, meticulously capturing long-term market trends and short-term fluctuations, including abrupt events. Moreover, the model incorporates a Dual-Frequency Spatiotemporal Encoder and graph embedding techniques to effectively capture complex temporal and spatial relationships among stocks. Employing a multitask learning strategy, it simultaneously predicts stock returns and directional trends. Experimental results show that Stockformer outperforms existing advanced methods on multiple real stock market datasets. In strategy backtesting, Stockformer consistently demonstrates exceptional stability and reliability across market conditions-whether rising, falling, or fluctuating-particularly maintaining high performance during downturns or volatile periods, indicating a high adaptability to market fluctuations. To foster innovation and collaboration in the financial analysis sector, the Stockformer model's code has been open-sourced and is available on the GitHub repository: https://github.com/Eric991005/Multitask-Stockformer.
TradingAgents: Multi-Agents LLM Financial Trading Framework
Significant progress has been made in automated problem-solving using societies of agents powered by large language models (LLMs). In finance, efforts have largely focused on single-agent systems handling specific tasks or multi-agent frameworks independently gathering data. However, the multi-agent systems' potential to replicate real-world trading firms' collaborative dynamics remains underexplored. TradingAgents proposes a novel stock trading framework inspired by trading firms, featuring LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. The framework includes Bull and Bear researcher agents assessing market conditions, a risk management team monitoring exposure, and traders synthesizing insights from debates and historical data to make informed decisions. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance. Detailed architecture and extensive experiments reveal its superiority over baseline models, with notable improvements in cumulative returns, Sharpe ratio, and maximum drawdown, highlighting the potential of multi-agent LLM frameworks in financial trading. TradingAgents is available at https://github.com/TauricResearch/TradingAgents.
Towards Standardization of Data Licenses: The Montreal Data License
This paper provides a taxonomy for the licensing of data in the fields of artificial intelligence and machine learning. The paper's goal is to build towards a common framework for data licensing akin to the licensing of open source software. Increased transparency and resolving conceptual ambiguities in existing licensing language are two noted benefits of the approach proposed in the paper. In parallel, such benefits may help foster fairer and more efficient markets for data through bringing about clearer tools and concepts that better define how data can be used in the fields of AI and ML. The paper's approach is summarized in a new family of data license language - the Montreal Data License (MDL). Alongside this new license, the authors and their collaborators have developed a web-based tool to generate license language espousing the taxonomies articulated in this paper.
Democratizing Tabular Data Access with an Openx2013Source Syntheticx2013Data SDK
Machine learning development critically depends on access to high-quality data. However, increasing restrictions due to privacy, proprietary interests, and ethical concerns have created significant barriers to data accessibility. Synthetic data offers a viable solution by enabling safe, broad data usage without compromising sensitive information. This paper presents the MOSTLY AI Synthetic Data Software Development Kit (SDK), an open-source toolkit designed specifically for synthesizing high-quality tabular data. The SDK integrates robust features such as differential privacy guarantees, fairness-aware data generation, and automated quality assurance into a flexible and accessible Python interface. Leveraging the TabularARGN autoregressive framework, the SDK supports diverse data types and complex multi-table and sequential datasets, delivering competitive performance with notable improvements in speed and usability. Currently deployed both as a cloud service and locally installable software, the SDK has seen rapid adoption, highlighting its practicality in addressing real-world data bottlenecks and promoting widespread data democratization.
A Time Series Analysis-Based Stock Price Prediction Using Machine Learning and Deep Learning Models
Prediction of future movement of stock prices has always been a challenging task for the researchers. While the advocates of the efficient market hypothesis (EMH) believe that it is impossible to design any predictive framework that can accurately predict the movement of stock prices, there are seminal work in the literature that have clearly demonstrated that the seemingly random movement patterns in the time series of a stock price can be predicted with a high level of accuracy. Design of such predictive models requires choice of appropriate variables, right transformation methods of the variables, and tuning of the parameters of the models. In this work, we present a very robust and accurate framework of stock price prediction that consists of an agglomeration of statistical, machine learning and deep learning models. We use the daily stock price data, collected at five minutes interval of time, of a very well known company that is listed in the National Stock Exchange (NSE) of India. The granular data is aggregated into three slots in a day, and the aggregated data is used for building and training the forecasting models. We contend that the agglomerative approach of model building that uses a combination of statistical, machine learning, and deep learning approaches, can very effectively learn from the volatile and random movement patterns in a stock price data. We build eight classification and eight regression models based on statistical and machine learning approaches. In addition to these models, a deep learning regression model using a long-and-short-term memory (LSTM) network is also built. Extensive results have been presented on the performance of these models, and the results are critically analyzed.
Simulating Financial Market via Large Language Model based Agents
Most economic theories typically assume that financial market participants are fully rational individuals and use mathematical models to simulate human behavior in financial markets. However, human behavior is often not entirely rational and is challenging to predict accurately with mathematical models. In this paper, we propose Agent-based Simulated Financial Market (ASFM), which first constructs a simulated stock market with a real order matching system. Then, we propose a large language model based agent as the stock trader, which contains the profile, observation, and tool-learning based action module. The trading agent can comprehensively understand current market dynamics and financial policy information, and make decisions that align with their trading strategy. In the experiments, we first verify that the reactions of our ASFM are consistent with the real stock market in two controllable scenarios. In addition, we also conduct experiments in two popular economics research directions, and we find that conclusions drawn in our \model align with the preliminary findings in economics research. Based on these observations, we believe our proposed ASFM provides a new paradigm for economic research.
Using Data Analytics to Derive Business Intelligence: A Case Study
The data revolution experienced in recent times has thrown up new challenges and opportunities for businesses of all sizes in diverse industries. Big data analytics is already at the forefront of innovations to help make meaningful business decisions from the abundance of raw data available today. Business intelligence and analytics has become a huge trend in todays IT world as companies of all sizes are looking to improve their business processes and scale up using data driven solutions. This paper aims to demonstrate the data analytical process of deriving business intelligence via the historical data of a fictional bike share company seeking to find innovative ways to convert their casual riders to annual paying registered members. The dataset used is freely available as Chicago Divvy Bicycle Sharing Data on Kaggle. The authors used the RTidyverse library in RStudio to analyse the data and followed the six data analysis steps of ask, prepare, process, analyse, share, and act to recommend some actionable approaches the company could adopt to convert casual riders to paying annual members. The findings from this research serve as a valuable case example, of a real world deployment of BIA technologies in the industry, and a demonstration of the data analysis cycle for data practitioners, researchers, and other potential users.
Robust Budget Pacing with a Single Sample
Major Internet advertising platforms offer budget pacing tools as a standard service for advertisers to manage their ad campaigns. Given the inherent non-stationarity in an advertiser's value and also competing advertisers' values over time, a commonly used approach is to learn a target expenditure plan that specifies a target spend as a function of time, and then run a controller that tracks this plan. This raises the question: how many historical samples are required to learn a good expenditure plan? We study this question by considering an advertiser repeatedly participating in T second-price auctions, where the tuple of her value and the highest competing bid is drawn from an unknown time-varying distribution. The advertiser seeks to maximize her total utility subject to her budget constraint. Prior work has shown the sufficiency of Tlog T samples per distribution to achieve the optimal O(T)-regret. We dramatically improve this state-of-the-art and show that just one sample per distribution is enough to achieve the near-optimal tilde O(T)-regret, while still being robust to noise in the sampling distributions.
Feature Learning for Stock Price Prediction Shows a Significant Role of Analyst Rating
To reject the Efficient Market Hypothesis a set of 5 technical indicators and 23 fundamental indicators was identified to establish the possibility of generating excess returns on the stock market. Leveraging these data points and various classification machine learning models, trading data of the 505 equities on the US S&P500 over the past 20 years was analysed to develop a classifier effective for our cause. From any given day, we were able to predict the direction of change in price by 1% up to 10 days in the future. The predictions had an overall accuracy of 83.62% with a precision of 85% for buy signals and a recall of 100% for sell signals. Moreover, we grouped equities by their sector and repeated the experiment to see if grouping similar assets together positively effected the results but concluded that it showed no significant improvements in the performance rejecting the idea of sector-based analysis. Also, using feature ranking we could identify an even smaller set of 6 indicators while maintaining similar accuracies as that from the original 28 features and also uncovered the importance of buy, hold and sell analyst ratings as they came out to be the top contributors in the model. Finally, to evaluate the effectiveness of the classifier in real-life situations, it was backtested on FAANG equities using a modest trading strategy where it generated high returns of above 60% over the term of the testing dataset. In conclusion, our proposed methodology with the combination of purposefully picked features shows an improvement over the previous studies, and our model predicts the direction of 1% price changes on the 10th day with high confidence and with enough buffer to even build a robotic trading system.
Designing Efficient Pair-Trading Strategies Using Cointegration for the Indian Stock Market
A pair-trading strategy is an approach that utilizes the fluctuations between prices of a pair of stocks in a short-term time frame, while in the long-term the pair may exhibit a strong association and co-movement pattern. When the prices of the stocks exhibit significant divergence, the shares of the stock that gains in price are sold (a short strategy) while the shares of the other stock whose price falls are bought (a long strategy). This paper presents a cointegration-based approach that identifies stocks listed in the five sectors of the National Stock Exchange (NSE) of India for designing efficient pair-trading portfolios. Based on the stock prices from Jan 1, 2018, to Dec 31, 2020, the cointegrated stocks are identified and the pairs are formed. The pair-trading portfolios are evaluated on their annual returns for the year 2021. The results show that the pairs of stocks from the auto and the realty sectors, in general, yielded the highest returns among the five sectors studied in the work. However, two among the five pairs from the information technology (IT) sector are found to have yielded negative returns.
Large Language Model Agent in Financial Trading: A Survey
Trading is a highly competitive task that requires a combination of strategy, knowledge, and psychological fortitude. With the recent success of large language models(LLMs), it is appealing to apply the emerging intelligence of LLM agents in this competitive arena and understanding if they can outperform professional traders. In this survey, we provide a comprehensive review of the current research on using LLMs as agents in financial trading. We summarize the common architecture used in the agent, the data inputs, and the performance of LLM trading agents in backtesting as well as the challenges presented in these research. This survey aims to provide insights into the current state of LLM-based financial trading agents and outline future research directions in this field.
Financial Risk Assessment via Long-term Payment Behavior Sequence Folding
Online inclusive financial services encounter significant financial risks due to their expansive user base and low default costs. By real-world practice, we reveal that utilizing longer-term user payment behaviors can enhance models' ability to forecast financial risks. However, learning long behavior sequences is non-trivial for deep sequential models. Additionally, the diverse fields of payment behaviors carry rich information, requiring thorough exploitation. These factors collectively complicate the task of long-term user behavior modeling. To tackle these challenges, we propose a Long-term Payment Behavior Sequence Folding method, referred to as LBSF. In LBSF, payment behavior sequences are folded based on merchants, using the merchant field as an intrinsic grouping criterion, which enables informative parallelism without reliance on external knowledge. Meanwhile, we maximize the utility of payment details through a multi-field behavior encoding mechanism. Subsequently, behavior aggregation at the merchant level followed by relational learning across merchants facilitates comprehensive user financial representation. We evaluate LBSF on the financial risk assessment task using a large-scale real-world dataset. The results demonstrate that folding long behavior sequences based on internal behavioral cues effectively models long-term patterns and changes, thereby generating more accurate user financial profiles for practical applications.
Blockchain-Based Federated Learning: Incentivizing Data Sharing and Penalizing Dishonest Behavior
With the increasing importance of data sharing for collaboration and innovation, it is becoming more important to ensure that data is managed and shared in a secure and trustworthy manner. Data governance is a common approach to managing data, but it faces many challenges such as data silos, data consistency, privacy, security, and access control. To address these challenges, this paper proposes a comprehensive framework that integrates data trust in federated learning with InterPlanetary File System, blockchain, and smart contracts to facilitate secure and mutually beneficial data sharing while providing incentives, access control mechanisms, and penalizing any dishonest behavior. The experimental results demonstrate that the proposed model is effective in improving the accuracy of federated learning models while ensuring the security and fairness of the data-sharing process. The research paper also presents a decentralized federated learning platform that successfully trained a CNN model on the MNIST dataset using blockchain technology. The platform enables multiple workers to train the model simultaneously while maintaining data privacy and security. The decentralized architecture and use of blockchain technology allow for efficient communication and coordination between workers. This platform has the potential to facilitate decentralized machine learning and support privacy-preserving collaboration in various domains.
Convolutional Feature Extraction and Neural Arithmetic Logic Units for Stock Prediction
Stock prediction is a topic undergoing intense study for many years. Finance experts and mathematicians have been working on a way to predict the future stock price so as to decide to buy the stock or sell it to make profit. Stock experts or economists, usually analyze on the previous stock values using technical indicators, sentiment analysis etc to predict the future stock price. In recent years, many researches have extensively used machine learning for predicting the stock behaviour. In this paper we propose data driven deep learning approach to predict the future stock value with the previous price with the feature extraction property of convolutional neural network and to use Neural Arithmetic Logic Units with it.
Constructing Time-Series Momentum Portfolios with Deep Multi-Task Learning
A diversified risk-adjusted time-series momentum (TSMOM) portfolio can deliver substantial abnormal returns and offer some degree of tail risk protection during extreme market events. The performance of existing TSMOM strategies, however, relies not only on the quality of the momentum signal but also on the efficacy of the volatility estimator. Yet many of the existing studies have always considered these two factors to be independent. Inspired by recent progress in Multi-Task Learning (MTL), we present a new approach using MTL in a deep neural network architecture that jointly learns portfolio construction and various auxiliary tasks related to volatility, such as forecasting realized volatility as measured by different volatility estimators. Through backtesting from January 2000 to December 2020 on a diversified portfolio of continuous futures contracts, we demonstrate that even after accounting for transaction costs of up to 3 basis points, our approach outperforms existing TSMOM strategies. Moreover, experiments confirm that adding auxiliary tasks indeed boosts the portfolio's performance. These findings demonstrate that MTL can be a powerful tool in finance.
AI Governance through Markets
This paper argues that market governance mechanisms should be considered a key approach in the governance of artificial intelligence (AI), alongside traditional regulatory frameworks. While current governance approaches have predominantly focused on regulation, we contend that market-based mechanisms offer effective incentives for responsible AI development. We examine four emerging vectors of market governance: insurance, auditing, procurement, and due diligence, demonstrating how these mechanisms can affirm the relationship between AI risk and financial risk while addressing capital allocation inefficiencies. While we do not claim that market forces alone can adequately protect societal interests, we maintain that standardised AI disclosures and market mechanisms can create powerful incentives for safe and responsible AI development. This paper urges regulators, economists, and machine learning researchers to investigate and implement market-based approaches to AI governance.
Magentic Marketplace: An Open-Source Environment for Studying Agentic Markets
As LLM agents advance, they are increasingly mediating economic decisions, ranging from product discovery to transactions, on behalf of users. Such applications promise benefits but also raise many questions about agent accountability and value for users. Addressing these questions requires understanding how agents behave in realistic market conditions. However, previous research has largely evaluated agents in constrained settings, such as single-task marketplaces (e.g., negotiation) or structured two-agent interactions. Real-world markets are fundamentally different: they require agents to handle diverse economic activities and coordinate within large, dynamic ecosystems where multiple agents with opaque behaviors may engage in open-ended dialogues. To bridge this gap, we investigate two-sided agentic marketplaces where Assistant agents represent consumers and Service agents represent competing businesses. To study these interactions safely, we develop Magentic-Marketplace-- a simulated environment where Assistants and Services can operate. This environment enables us to study key market dynamics: the utility agents achieve, behavioral biases, vulnerability to manipulation, and how search mechanisms shape market outcomes. Our experiments show that frontier models can approach optimal welfare-- but only under ideal search conditions. Performance degrades sharply with scale, and all models exhibit severe first-proposal bias, creating 10-30x advantages for response speed over quality. These findings reveal how behaviors emerge across market conditions, informing the design of fair and efficient agentic marketplaces.
Decongestion by Representation: Learning to Improve Economic Welfare in Marketplaces
Congestion is a common failure mode of markets, where consumers compete inefficiently on the same subset of goods (e.g., chasing the same small set of properties on a vacation rental platform). The typical economic story is that prices decongest by balancing supply and demand. But in modern online marketplaces, prices are typically set in a decentralized way by sellers, and the information about items is inevitably partial. The power of a platform is limited to controlling representations -- the subset of information about items presented by default to users. This motivates the present study of decongestion by representation, where a platform seeks to learn representations that reduce congestion and thus improve social welfare. The technical challenge is twofold: relying only on revealed preferences from the choices of consumers, rather than true preferences; and the combinatorial problem associated with representations that determine the features to reveal in the default view. We tackle both challenges by proposing a differentiable proxy of welfare that can be trained end-to-end on consumer choice data. We develop sufficient conditions for when decongestion promotes welfare, and present the results of extensive experiments on both synthetic and real data that demonstrate the utility of our approach.
Stock Volatility Prediction using Time Series and Deep Learning Approach
Volatility clustering is a crucial property that has a substantial impact on stock market patterns. Nonetheless, developing robust models for accurately predicting future stock price volatility is a difficult research topic. For predicting the volatility of three equities listed on India's national stock market (NSE), we propose multiple volatility models depending on the generalized autoregressive conditional heteroscedasticity (GARCH), Glosten-Jagannathan-GARCH (GJR-GARCH), Exponential general autoregressive conditional heteroskedastic (EGARCH), and LSTM framework. Sector-wise stocks have been chosen in our study. The sectors which have been considered are banking, information technology (IT), and pharma. yahoo finance has been used to obtain stock price data from Jan 2017 to Dec 2021. Among the pulled-out records, the data from Jan 2017 to Dec 2020 have been taken for training, and data from 2021 have been chosen for testing our models. The performance of predicting the volatility of stocks of three sectors has been evaluated by implementing three different types of GARCH models as well as by the LSTM model are compared. It has been observed the LSTM performed better in predicting volatility in pharma over banking and IT sectors. In tandem, it was also observed that E-GARCH performed better in the case of the banking sector and for IT and pharma, GJR-GARCH performed better.
ATLAS: Adaptive Trading with LLM AgentS Through Dynamic Prompt Optimization and Multi-Agent Coordination
Large language models show promise for financial decision-making, yet deploying them as autonomous trading agents raises fundamental challenges: how to adapt instructions when rewards arrive late and obscured by market noise, how to synthesize heterogeneous information streams into coherent decisions, and how to bridge the gap between model outputs and executable market actions. We present ATLAS (Adaptive Trading with LLM AgentS), a unified multi-agent framework that integrates structured information from markets, news, and corporate fundamentals to support robust trading decisions. Within ATLAS, the central trading agent operates in an order-aware action space, ensuring that outputs correspond to executable market orders rather than abstract signals. The agent can incorporate feedback while trading using Adaptive-OPRO, a novel prompt-optimization technique that dynamically adapts the prompt by incorporating real-time, stochastic feedback, leading to increasing performance over time. Across regime-specific equity studies and multiple LLM families, Adaptive-OPRO consistently outperforms fixed prompts, while reflection-based feedback fails to provide systematic gains.
CriteoPrivateAd: A Real-World Bidding Dataset to Design Private Advertising Systems
In the past years, many proposals have emerged in order to address online advertising use-cases without access to third-party cookies. All these proposals leverage some privacy-enhancing technologies such as aggregation or differential privacy. Yet, no public and rich-enough ground truth is currently available to assess the relevancy of aforementioned private advertising frameworks. We are releasing the largest, in terms of number of features, bidding dataset specifically built in alignment with the design of major browser vendors proposals such as Chrome Privacy Sandbox. This dataset, coined CriteoPrivateAd, stands for an anonymised version of Criteo production logs and provides sufficient data to learn bidding models commonly used in online advertising under many privacy constraints (delayed reports, display and user-level differential privacy, user signal quantisation or aggregated reports). We ensured that this dataset, while being anonymised, is able to provide offline results close to production performance of adtech companies including Criteo - making it a relevant ground truth to design private advertising systems. The dataset is available in Hugging Face: https://huggingface.co/datasets/criteo/CriteoPrivateAd.
INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent
Recent advancements have underscored the potential of large language model (LLM)-based agents in financial decision-making. Despite this progress, the field currently encounters two main challenges: (1) the lack of a comprehensive LLM agent framework adaptable to a variety of financial tasks, and (2) the absence of standardized benchmarks and consistent datasets for assessing agent performance. To tackle these issues, we introduce InvestorBench, the first benchmark specifically designed for evaluating LLM-based agents in diverse financial decision-making contexts. InvestorBench enhances the versatility of LLM-enabled agents by providing a comprehensive suite of tasks applicable to different financial products, including single equities like stocks, cryptocurrencies and exchange-traded funds (ETFs). Additionally, we assess the reasoning and decision-making capabilities of our agent framework using thirteen different LLMs as backbone models, across various market environments and tasks. Furthermore, we have curated a diverse collection of open-source, multi-modal datasets and developed a comprehensive suite of environments for financial decision-making. This establishes a highly accessible platform for evaluating financial agents' performance across various scenarios.
AI PB: A Grounded Generative Agent for Personalized Investment Insights
We present AI PB, a production-scale generative agent deployed in real retail finance. Unlike reactive chatbots that answer queries passively, AI PB proactively generates grounded, compliant, and user-specific investment insights. It integrates (i) a component-based orchestration layer that deterministically routes between internal and external LLMs based on data sensitivity, (ii) a hybrid retrieval pipeline using OpenSearch and the finance-domain embedding model, and (iii) a multi-stage recommendation mechanism combining rule heuristics, sequential behavioral modeling, and contextual bandits. Operating fully on-premises under Korean financial regulations, the system employs Docker Swarm and vLLM across 24 X NVIDIA H100 GPUs. Through human QA and system metrics, we demonstrate that grounded generation with explicit routing and layered safety can deliver trustworthy AI insights in high-stakes finance.
Statistical Inference and A/B Testing for First-Price Pacing Equilibria
We initiate the study of statistical inference and A/B testing for first-price pacing equilibria (FPPE). The FPPE model captures the dynamics resulting from large-scale first-price auction markets where buyers use pacing-based budget management. Such markets arise in the context of internet advertising, where budgets are prevalent. We propose a statistical framework for the FPPE model, in which a limit FPPE with a continuum of items models the long-run steady-state behavior of the auction platform, and an observable FPPE consisting of a finite number of items provides the data to estimate primitives of the limit FPPE, such as revenue, Nash social welfare (a fair metric of efficiency), and other parameters of interest. We develop central limit theorems and asymptotically valid confidence intervals. Furthermore, we establish the asymptotic local minimax optimality of our estimators. We then show that the theory can be used for conducting statistically valid A/B testing on auction platforms. Numerical simulations verify our central limit theorems, and empirical coverage rates for our confidence intervals agree with our theory.
Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets
In online ad markets, a rising number of advertisers are employing bidding agencies to participate in ad auctions. These agencies are specialized in designing online algorithms and bidding on behalf of their clients. Typically, an agency usually has information on multiple advertisers, so she can potentially coordinate bids to help her clients achieve higher utilities than those under independent bidding. In this paper, we study coordinated online bidding algorithms in repeated second-price auctions with budgets. We propose algorithms that guarantee every client a higher utility than the best she can get under independent bidding. We show that these algorithms achieve maximal coalition welfare and discuss bidders' incentives to misreport their budgets, in symmetric cases. Our proofs combine the techniques of online learning and equilibrium analysis, overcoming the difficulty of competing with a multi-dimensional benchmark. The performance of our algorithms is further evaluated by experiments on both synthetic and real data. To the best of our knowledge, we are the first to consider bidder coordination in online repeated auctions with constraints.
Pre-training Time Series Models with Stock Data Customization
Stock selection, which aims to predict stock prices and identify the most profitable ones, is a crucial task in finance. While existing methods primarily focus on developing model structures and building graphs for improved selection, pre-training strategies remain underexplored in this domain. Current stock series pre-training follows methods from other areas without adapting to the unique characteristics of financial data, particularly overlooking stock-specific contextual information and the non-stationary nature of stock prices. Consequently, the latent statistical features inherent in stock data are underutilized. In this paper, we propose three novel pre-training tasks tailored to stock data characteristics: stock code classification, stock sector classification, and moving average prediction. We develop the Stock Specialized Pre-trained Transformer (SSPT) based on a two-layer transformer architecture. Extensive experimental results validate the effectiveness of our pre-training methods and provide detailed guidance on their application. Evaluations on five stock datasets, including four markets and two time periods, demonstrate that SSPT consistently outperforms the market and existing methods in terms of both cumulative investment return ratio and Sharpe ratio. Additionally, our experiments on simulated data investigate the underlying mechanisms of our methods, providing insights into understanding price series. Our code is publicly available at: https://github.com/astudentuser/Pre-training-Time-Series-Models-with-Stock-Data-Customization.
Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection
This paper introduces MarketSenseAI, an innovative framework leveraging GPT-4's advanced reasoning for selecting stocks in financial markets. By integrating Chain of Thought and In-Context Learning, MarketSenseAI analyzes diverse data sources, including market trends, news, fundamentals, and macroeconomic factors, to emulate expert investment decision-making. The development, implementation, and validation of the framework are elaborately discussed, underscoring its capability to generate actionable and interpretable investment signals. A notable feature of this work is employing GPT-4 both as a predictive mechanism and signal evaluator, revealing the significant impact of the AI-generated explanations on signal accuracy, reliability and acceptance. Through empirical testing on the competitive S&P 100 stocks over a 15-month period, MarketSenseAI demonstrated exceptional performance, delivering excess alpha of 10% to 30% and achieving a cumulative return of up to 72% over the period, while maintaining a risk profile comparable to the broader market. Our findings highlight the transformative potential of Large Language Models in financial decision-making, marking a significant leap in integrating generative AI into financial analytics and investment strategies.
CTAB-GAN+: Enhancing Tabular Data Synthesis
While data sharing is crucial for knowledge development, privacy concerns and strict regulation (e.g., European General Data Protection Regulation (GDPR)) limit its full effectiveness. Synthetic tabular data emerges as alternative to enable data sharing while fulfilling regulatory and privacy constraints. State-of-the-art tabular data synthesizers draw methodologies from Generative Adversarial Networks (GAN). As GANs improve the synthesized data increasingly resemble the real data risking to leak privacy. Differential privacy (DP) provides theoretical guarantees on privacy loss but degrades data utility. Striking the best trade-off remains yet a challenging research question. We propose CTAB-GAN+ a novel conditional tabular GAN. CTAB-GAN+ improves upon state-of-the-art by (i) adding downstream losses to conditional GANs for higher utility synthetic data in both classification and regression domains; (ii) using Wasserstein loss with gradient penalty for better training convergence; (iii) introducing novel encoders targeting mixed continuous-categorical variables and variables with unbalanced or skewed data; and (iv) training with DP stochastic gradient descent to impose strict privacy guarantees. We extensively evaluate CTAB-GAN+ on data similarity and analysis utility against state-of-the-art tabular GANs. The results show that CTAB-GAN+ synthesizes privacy-preserving data with at least 48.16% higher utility across multiple datasets and learning tasks under different privacy budgets.
Quantum computational finance: quantum algorithm for portfolio optimization
We present a quantum algorithm for portfolio optimization. We discuss the market data input, the processing of such data via quantum operations, and the output of financially relevant results. Given quantum access to the historical record of returns, the algorithm determines the optimal risk-return tradeoff curve and allows one to sample from the optimal portfolio. The algorithm can in principle attain a run time of {rm poly}(log(N)), where N is the size of the historical return dataset. Direct classical algorithms for determining the risk-return curve and other properties of the optimal portfolio take time {rm poly}(N) and we discuss potential quantum speedups in light of the recent works on efficient classical sampling approaches.
Deep Reinforcement Learning in Cryptocurrency Market Making
This paper sets forth a framework for deep reinforcement learning as applied to market making (DRLMM) for cryptocurrencies. Two advanced policy gradient-based algorithms were selected as agents to interact with an environment that represents the observation space through limit order book data, and order flow arrival statistics. Within the experiment, a forward-feed neural network is used as the function approximator and two reward functions are compared. The performance of each combination of agent and reward function is evaluated by daily and average trade returns. Using this DRLMM framework, this paper demonstrates the effectiveness of deep reinforcement learning in solving stochastic inventory control challenges market makers face.
FinVision: A Multi-Agent Framework for Stock Market Prediction
Financial trading has been a challenging task, as it requires the integration of vast amounts of data from various modalities. Traditional deep learning and reinforcement learning methods require large training data and often involve encoding various data types into numerical formats for model input, which limits the explainability of model behavior. Recently, LLM-based agents have demonstrated remarkable advancements in handling multi-modal data, enabling them to execute complex, multi-step decision-making tasks while providing insights into their thought processes. This research introduces a multi-modal multi-agent system designed specifically for financial trading tasks. Our framework employs a team of specialized LLM-based agents, each adept at processing and interpreting various forms of financial data, such as textual news reports, candlestick charts, and trading signal charts. A key feature of our approach is the integration of a reflection module, which conducts analyses of historical trading signals and their outcomes. This reflective process is instrumental in enhancing the decision-making capabilities of the system for future trading scenarios. Furthermore, the ablation studies indicate that the visual reflection module plays a crucial role in enhancing the decision-making capabilities of our framework.
EmTract: Investor Emotions and Market Behavior
We develop a tool that extracts emotions from social media text data. Our methodology has three main advantages. First, it is tailored for financial context; second, it incorporates key aspects of social media data, such as non-standard phrases, emojis and emoticons; and third, it operates by sequentially learning a latent representation that includes features such as word order, word usage, and local context. This tool, along with a user guide is available at: https://github.com/dvamossy/EmTract. Using EmTract, we explore the relationship between investor emotions expressed on social media and asset prices. We document a number of interesting insights. First, we confirm some of the findings of controlled laboratory experiments relating investor emotions to asset price movements. Second, we show that investor emotions are predictive of daily price movements. These impacts are larger when volatility or short interest are higher, and when institutional ownership or liquidity are lower. Third, increased investor enthusiasm prior to the IPO contributes to the large first-day return and long-run underperformance of IPO stocks. To corroborate our results, we provide a number of robustness checks, including using an alternative emotion model. Our findings reinforce the intuition that emotions and market dynamics are closely related, and highlight the importance of considering investor emotions when assessing a stock's short-term value.
An Alternative Framework for Time Series Decomposition and Forecasting and its Relevance for Portfolio Choice: A Comparative Study of the Indian Consumer Durable and Small Cap Sectors
One of the challenging research problems in the domain of time series analysis and forecasting is making efficient and robust prediction of stock market prices. With rapid development and evolution of sophisticated algorithms and with the availability of extremely fast computing platforms, it has now become possible to effectively extract, store, process and analyze high volume stock market time series data. Complex algorithms for forecasting are now available for speedy execution over parallel architecture leading to fairly accurate results. In this paper, we have used time series data of the two sectors of the Indian economy: Consumer Durables sector and the Small Cap sector for the period January 2010 to December 2015 and proposed a decomposition approach for better understanding of the behavior of each of the time series. Our contention is that various sectors reveal different time series patterns and understanding them is essential for portfolio formation. Further, based on this structural analysis, we have also proposed several robust forecasting techniques and analyzed their accuracy in prediction using suitably chosen training and test data sets. Extensive results are presented to demonstrate the effectiveness of our propositions.
Topological Components in a Community Currency Network
Transaction data from digital payment systems can be used to study economic processes at such a detail that was not possible previously. Here, we analyse the data from Sarafu token network, a community inclusion currency in Kenya. During the COVID-19 emergency, the Sarafu was disbursed as part of a humanitarian aid project. In this work, the transactions are analysed using network science. A topological categorisation is defined to identify cyclic and acyclic components. Furthermore, temporal aspects of circulation taking place within these components are considered. The significant presence of different types of strongly connected components as compared to randomized null models shows the importance of cycles in this economic network. Especially, indicating their key role in currency recirculation. In some acyclic components, the most significant triad suggests the presence of a group of users collecting currency from accounts active only once, hinting at a misuse of the system. In some other acyclic components, small isolated groups of users were active only once, suggesting the presence of users only interested in trying out the system. The methods used in this paper can answer specific questions related to user activities, currency design, and assessment of monetary interventions. Our methodology provides a general quantitative tool for analysing the behaviour of users in a currency network.
Multi-channel Autobidding with Budget and ROI Constraints
In digital online advertising, advertisers procure ad impressions simultaneously on multiple platforms, or so-called channels, such as Google Ads, Meta Ads Manager, etc., each of which consists of numerous ad auctions. We study how an advertiser maximizes total conversion (e.g. ad clicks) while satisfying aggregate return-on-investment (ROI) and budget constraints across all channels. In practice, an advertiser does not have control over, and thus cannot globally optimize, which individual ad auctions she participates in for each channel, and instead authorizes a channel to procure impressions on her behalf: the advertiser can only utilize two levers on each channel, namely setting a per-channel budget and per-channel target ROI. In this work, we first analyze the effectiveness of each of these levers for solving the advertiser's global multi-channel problem. We show that when an advertiser only optimizes over per-channel ROIs, her total conversion can be arbitrarily worse than what she could have obtained in the global problem. Further, we show that the advertiser can achieve the global optimal conversion when she only optimizes over per-channel budgets. In light of this finding, under a bandit feedback setting that mimics real-world scenarios where advertisers have limited information on ad auctions in each channels and how channels procure ads, we present an efficient learning algorithm that produces per-channel budgets whose resulting conversion approximates that of the global optimal problem. Finally, we argue that all our results hold for both single-item and multi-item auctions from which channels procure impressions on advertisers' behalf.
Performance Evaluation of Equal-Weight Portfolio and Optimum Risk Portfolio on Indian Stocks
Designing an optimum portfolio for allocating suitable weights to its constituent assets so that the return and risk associated with the portfolio are optimized is a computationally hard problem. The seminal work of Markowitz that attempted to solve the problem by estimating the future returns of the stocks is found to perform sub-optimally on real-world stock market data. This is because the estimation task becomes extremely challenging due to the stochastic and volatile nature of stock prices. This work illustrates three approaches to portfolio design minimizing the risk, optimizing the risk, and assigning equal weights to the stocks of a portfolio. Thirteen critical sectors listed on the National Stock Exchange (NSE) of India are first chosen. Three portfolios are designed following the above approaches choosing the top ten stocks from each sector based on their free-float market capitalization. The portfolios are designed using the historical prices of the stocks from Jan 1, 2017, to Dec 31, 2022. The portfolios are evaluated on the stock price data from Jan 1, 2022, to Dec 31, 2022. The performances of the portfolios are compared, and the portfolio yielding the higher return for each sector is identified.
Non cooperative Liquidity Games and their application to bond market trading
We present a new type of game, the Liquidity Game. We draw inspiration from the UK government bond market and apply game theoretic approaches to its analysis. In Liquidity Games, market participants (agents) use non-cooperative games where the players' utility is directly defined by the liquidity of the game itself, offering a paradigm shift in our understanding of market dynamics. Each player's utility is intricately linked to the liquidity generated within the game, making the utility endogenous and dynamic. Players are not just passive recipients of utility based on external factors but active participants whose strategies and actions collectively shape and are shaped by the liquidity of the market. This reflexivity introduces a level of complexity and realism previously unattainable in conventional models. We apply Liquidity Game theoretic approaches to a simple UK bond market interaction and present results for market design and strategic behavior of participants. We tackle one of the largest issues within this mechanism, namely what strategy should market makers utilize when uncertain about the type of market maker they are interacting with, and what structure might regulators wish to see.
