Phase Three Quant Work Summary and Next Steps: Complexity and Cost Control


Phase three completed end-to-end development and testing of the multi-factor quant system, and the strategy is now preparing to go live. The next stage will focus on strengthening alpha capabilities and exploring a more diversified set of strategies.

Phase Three Review

Paper Trading

During phase three, the full pipeline was automated and a paper-trading account has been running successfully. The core functionality is now essentially stable.

End-to-end automation In phase two, factor production had already been routinized in practice. Over the past few months there were occasional data interruptions, but after investigation and fixes, the system now runs smoothly. On top of the factor production pipeline, this phase added management of the best model and portfolio-optimization parameters, as well as online training, inference, and trade-task generation. The final piece was programmatic trading: the system was connected to the Xuntou GT platform to support account management and asynchronous execution of daily trade tasks. Programmatic trading proved to be a rabbit hole: in the early days of paper trading, new problems surfaced quite literally every day, and it took one or two weeks of continuous fixes before the system stabilized.
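The asynchronous daily execution described above can be sketched as follows. `StubBroker`, `TradeTask`, and `execute_daily_tasks` are all hypothetical names for illustration; the actual Xuntou GT interface is proprietary and differs:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class TradeTask:
    symbol: str
    side: str          # "buy" or "sell"
    quantity: int
    filled: int = 0

class StubBroker:
    """Hypothetical stand-in for the broker gateway; the real Xuntou GT
    API is proprietary and its interface differs."""
    async def submit(self, task: TradeTask) -> None:
        await asyncio.sleep(0)        # simulate a network round-trip
        task.filled = task.quantity   # assume a full fill for the sketch

async def execute_daily_tasks(broker, tasks, max_concurrency=4):
    """Run the day's trade tasks asynchronously with bounded concurrency,
    so one slow order does not block the rest of the batch."""
    sem = asyncio.Semaphore(max_concurrency)

    async def run_one(task):
        async with sem:
            await broker.submit(task)

    await asyncio.gather(*(run_one(t) for t in tasks))
    return tasks

tasks = [TradeTask("600000.SH", "buy", 1000), TradeTask("000001.SZ", "sell", 500)]
done = asyncio.run(execute_daily_tasks(StubBroker(), tasks))
```

The semaphore bound matters in practice: submitting every order at once can hit broker-side rate limits, while fully serial submission wastes the trading window.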

Strategy deployment During phase three, especially in the development of trade execution algorithms, there was very little external material worth referencing. This process is the engineering step that turns theoretical strategy returns into realized returns, and for two reasons it is rarely written up in detail. First, much of the engineering work is not especially sophisticated and is not something people bother to publish. Second, any institution that has already deployed such systems keeps the details strictly confidential. In this phase I also completed a large amount of tedious but indispensable work covering odd-lot handling, STAR Market-specific rules, regulatory constraints, daily price limits, and other special cases.
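The special-case handling mentioned above might look roughly like this. The lot-size and price-limit rules below are simplified assumptions (ST stocks, newly listed shares, and the Beijing Stock Exchange follow different rules), so treat this as an illustrative sketch rather than production logic:

```python
def adjust_order(symbol: str, target_qty: int, side: str) -> int:
    """Round a target quantity to a tradable size under simplified A-share
    lot rules: main-board buys in multiples of 100 shares; STAR Market
    (688xxx) buys need at least 200 shares, then single-share increments;
    odd lots may be sold but not bought."""
    is_star = symbol.startswith("688")
    if side == "buy":
        if is_star:
            return target_qty if target_qty >= 200 else 0
        return (target_qty // 100) * 100       # round down to a full lot
    return target_qty                           # sells may dispose of odd lots

def within_price_limit(symbol: str, price: float, prev_close: float) -> bool:
    """Daily price-limit check: +-20% for STAR Market and ChiNext,
    +-10% for the main boards (ignoring ST stocks and new listings)."""
    limit = 0.20 if symbol.startswith(("688", "300")) else 0.10
    return abs(price / prev_close - 1.0) <= limit + 1e-9
```

For example, `adjust_order("600000", 1234, "buy")` rounds down to 1200 shares, while the same quantity on the sell side passes through unchanged.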

Removing the Dross and Keeping the Essence

When developing proprietary strategies, there is a natural impulse to add every powerful component one can imagine. But excess is harmful. Near the end of phase three, every layer of the strategy went through a simplification process.

The upper bound of strategy performance and the constraint of complexity What determines the upper bound of a strategy? Any system has intrinsic limits. From a macro perspective, an alpha strategy is a process that converts information into decisions: one end is the data stream, the other is the trade instruction. Once the data is fixed, unlimited gains cannot come from endlessly increasing complexity. A strategy is not an end-to-end learning process, but like any machine learning system, improving model performance on fixed data is ultimately just pushing closer to the limit implied by that data. Data is scarce, while model complexity proliferates under the researcher's greed. So when a strategy fails out of sample, the cause is almost certainly overfitting rather than a lack of sophisticated architecture. At the micro level, adding more features, deeper networks, or more risk factors almost always makes the market look better explained on the training and validation sets, yet backtests and live trading eventually hit an invisible wall.
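The fixed-data-versus-growing-complexity argument can be demonstrated with a toy experiment: fitting polynomials of increasing degree to one small noisy sample drives training error down monotonically while a gap to held-out data opens up. This illustrates the general point only and models no actual strategy:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40)
y = np.sin(3 * x) + rng.normal(0, 0.3, x.size)   # one fixed, noisy dataset
x_tr, y_tr = x[::2], y[::2]                      # "training" half
x_te, y_te = x[1::2], y[1::2]                    # held-out half

def errors(degree):
    """Train/held-out MSE of a polynomial fit of the given degree."""
    coef = np.polyfit(x_tr, y_tr, degree)
    mse = lambda xs, ys: float(np.mean((np.polyval(coef, xs) - ys) ** 2))
    return mse(x_tr, y_tr), mse(x_te, y_te)

train_err = {d: errors(d)[0] for d in (1, 5, 12)}
test_err = {d: errors(d)[1] for d in (1, 5, 12)}
```

Training error at degree 12 is necessarily no worse than at degree 1 (the model classes are nested), but the held-out error stops tracking it: the extra capacity is spent fitting the noise in this particular sample.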

The lower bound of strategy survival and cost advantage When markets become more efficient, what kind of strategy can still survive? I believe the answer must be framed in terms of cost. In information security, virtually no ciphertext is safe against unlimited compute, but if the cost of cracking exceeds the value of the plaintext, the information is effectively safe. A strategy works the same way. Cost advantage is, in essence, an advantage built through technical strength: develop strategies cheaply in specialized niches, compress the industry's profit margins, and eventually deprive competitors of the incentive to continue. From this angle, some performance decay is not necessarily bad. In the end, technical capability determines research efficiency and execution efficiency, and therefore whether a strategy lives or dies. The right path is to use relative technical advantage to carve out a defensible niche. For me personally, the next direction is not to confront similar strategies head-on through larger data, larger models, or self-developed execution algorithms, but to keep the technical architecture lean, fill in the missing basics, and continue research where I believe my own advantages lie.

Errors and Lessons

Overall, phase three was completed less efficiently than expected, mainly for two reasons.

First, the time spent researching a self-developed execution algorithm was a strategic and tactical mistake, costing roughly four weeks of labor.

  • Before the project began, I failed to conduct a comprehensive survey of the conditions and performance of algorithmic execution services already available in the market, and instead started development directly from a flashy academic paper. Strategically, focusing on alpha generation while outsourcing execution would have offered a better cost structure.
  • Tactically, choosing an end-to-end multi-agent deep reinforcement learning framework exceeded both the complexity ceiling of the available intraday data and my own engineering capacity.

Second, the effort spent recruiting and working with junior researchers produced less than expected and cost about two weeks of labor. Although external resources and personnel management are affected by many factors and their cost-output ratio is hard to control, I still regard the delay to the main line as partly my own responsibility.

  • Even when effort and output are proportional, opportunity cost still matters, so time investment should be capped and the core work should come first.
  • Equality applies to human dignity, but in technical work, treating every idea as equally worth discussing harms overall efficiency. To build and maintain an excellent system, a Linus-style benevolent dictator is a reasonable model.
  • The conflict between personal growth and transactional work is unavoidable.

Phase Four Plan

Fundamentals

Beyond price and volume information, fundamentals are an important part of the strategy and can provide a great deal of signal. In the next phase, the plan is to complete the fundamental factor set in two ways. First, identify, summarize, and build a relatively unified framework for computing fundamental factors. Second, introduce earnings forecasts so the strategy can react quickly to earnings surprises.
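As one illustration of a unified fundamental-factor computation, here is a minimal standardized-unexpected-earnings (SUE) sketch under a seasonal random-walk model. The lookback length and scaling convention are assumptions, since the post does not specify its framework:

```python
import numpy as np

def sue(eps: np.ndarray, lookback: int = 8) -> float:
    """Standardized unexpected earnings under a seasonal random walk:
    the surprise is EPS minus EPS four quarters earlier, scaled by the
    standard deviation of past surprises. Conventions vary by shop;
    the 8-quarter lookback here is an assumption."""
    surprises = eps[4:] - eps[:-4]           # year-over-year EPS changes
    hist = surprises[-(lookback + 1):-1]     # trailing window, excluding latest
    return float(surprises[-1] / hist.std(ddof=1))

quarterly_eps = np.array(
    [1.0, 1.0, 1.0, 1.0, 1.1, 1.2, 1.1, 1.2, 1.2, 1.4, 1.2, 1.4, 2.2])
signal = sue(quarterly_eps)   # large positive: the latest surprise is unusual
```

Swapping the seasonal random walk for analyst consensus forecasts turns the same skeleton into the earnings-surprise reaction mentioned above: only the definition of "expected EPS" changes.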

Relational Momentum

The theoretical part of relational momentum has already been organized elsewhere, so it will not be repeated here. At this point the factor’s profitability has been preliminarily verified. In phase four, the goal is to build the algorithmic and engineering framework for relational momentum with graph methods, so that relationships involving tens of millions of edges across the cross-section can be handled efficiently and in a unified manner. Once the core framework is in place, existing research reports and papers can be used to instantiate and mass-produce specific relational momentum factors.
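A minimal dense sketch of a peer-momentum signal on a relation graph is shown below. A real system would hold tens of millions of edges in a sparse matrix, and the concrete factor definitions would come from the research reports mentioned above; this only shows the row-normalized propagation step:

```python
import numpy as np

def relational_momentum(returns: np.ndarray, adj: np.ndarray) -> np.ndarray:
    """Peer-momentum signal: each stock's value is the average past return
    of its neighbors under a row-normalized adjacency matrix. Dense here
    purely for illustration; production would use a sparse representation
    for graphs with tens of millions of edges."""
    deg = adj.sum(axis=1, keepdims=True)
    weights = np.divide(adj, deg, out=np.zeros(adj.shape), where=deg > 0)
    return weights @ returns

# Three stocks: stock 0 is related to stocks 1 and 2, stock 1 to stock 0.
adj = np.array([[0, 1, 1], [1, 0, 0], [0, 0, 0]], dtype=float)
past_returns = np.array([0.0, 0.1, 0.3])
signal = relational_momentum(past_returns, adj)   # [0.2, 0.0, 0.0]
```

Because the operation is just a sparse-matrix-times-vector product, different relation types (supply chain, shared analysts, co-mentions) can reuse one framework and differ only in the adjacency matrix, which is the unification the phase-four plan aims for.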

End-to-End Models

I discussed end-to-end return models previously but never implemented them. The root cause was again the complexity constraint: my setting of high-order features plus complex models did not work well, while the public literature, which lacks rich factor libraries, mostly follows the opposite pattern of feeding low-order raw features into complex models. In the next phase, the role of such models will be reframed as "feature extractors" rather than final return predictors: their purpose is to partially replace handcrafted factors within the differentiable space of daily and intraday data.
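A numpy sketch of the "feature extractor" framing: the network's hidden activations, not its final prediction, are exported as factors. The weights below are random placeholders; in practice they would come from training the full network on next-period returns and then discarding the prediction head:

```python
import numpy as np

rng = np.random.default_rng(0)

class FeatureExtractor:
    """Tiny MLP used as a feature extractor rather than a return predictor:
    the hidden activations, not the head's output, feed the factor library.
    Weights are random placeholders standing in for trained parameters."""

    def __init__(self, n_inputs: int, n_features: int):
        self.w1 = rng.normal(0.0, 0.1, (n_inputs, n_features))
        self.b1 = np.zeros(n_features)
        self.w2 = rng.normal(0.0, 0.1, (n_features, 1))   # prediction head

    def features(self, x: np.ndarray) -> np.ndarray:
        return np.tanh(x @ self.w1 + self.b1)   # the learned "factors"

    def predict(self, x: np.ndarray) -> np.ndarray:
        return self.features(x) @ self.w2       # head, dropped at deployment

fx = FeatureExtractor(n_inputs=6, n_features=4)
panel = rng.normal(size=(100, 6))        # 100 stocks x 6 raw inputs
factors = fx.features(panel)             # 100 stocks x 4 extracted features
```

The design choice is that the downstream factor combiner, not the network's own head, decides how the learned features map to positions, so the extractor competes with handcrafted factors on equal footing.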

Event-Driven Strategies

Events are a more intuitive source of excess return and are particularly suitable for proprietary strategies.

For event testing, the plan is to combine a counterfactual framework with factor models, which will be discussed in a future article. In terms of turning research into production, the first path is to build standalone event-driven strategies. Under an event-driven worldview, the focus is on specific investment opportunities rather than relative value across the cross-section, so such strategies can complement multi-factor approaches. The second path is to convert broadly covered events into event factors.
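A single-factor (market-model) event-study skeleton illustrates the counterfactual idea: the factor model supplies the counterfactual return path, and the event's effect is the cumulative deviation from it. The multi-factor, matched-control design the text alludes to would extend this same structure:

```python
import numpy as np

def abnormal_returns(stock_ret, market_ret, event_idx, window=3, est_len=60):
    """Market-model event study: fit alpha/beta on a pre-event window,
    build the counterfactual return path over the event window, and
    report the cumulative abnormal return (CAR)."""
    est = slice(event_idx - est_len, event_idx)
    beta, alpha = np.polyfit(market_ret[est], stock_ret[est], 1)
    ev = slice(event_idx, event_idx + window)
    expected = alpha + beta * market_ret[ev]      # counterfactual path
    return float(np.sum(stock_ret[ev] - expected))

# Synthetic check: a stock that tracks the market exactly, plus a 5%
# jump injected on the event day, should show a CAR of about 0.05.
rng = np.random.default_rng(1)
market = rng.normal(0.0, 0.01, 100)
stock = market.copy()
stock[80] += 0.05
car = abnormal_returns(stock, market, event_idx=80)
```

The same function works for both production paths named above: a standalone event strategy trades the CAR directly, while an event factor aggregates it across all stocks carrying the event on a given date.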

Longer-Term Outlook

From a high-level perspective, the most cost-effective directions are new markets and new strategies. On market selection, crypto still looks relatively inefficient and is worth trying. In A-shares, I also look forward to converting event-driven research into live event strategies. Because events occur at random times, the main challenge from theory to execution is position management and response speed. Over the longer run, an A-share-plus-crypto core could balance beta exposure across regions and strategy types, and other high-quality strategies could later be layered on top as part of portfolio allocation.

At the technical level, I am optimistic about applying LLMs to text analysis. LLMs offer a cheap supply of intelligence, and using them to replace low-value manual analysis is a natural direction for quantitative investing. Once foreign clusters at the hundred-thousand-GPU scale are built, model capability will likely take another step forward. When that happens, the plan is to integrate open-source advances more fully into the strategy stack.

More concretely, the future technical edge should come from the following:

  • Polars-centered high-frequency data processing
  • Graph-based discovery of cross-sectional relational momentum
  • Testing sparse event alpha and reacting to it quickly
  • Large-scale text analysis centered on LLMs


https://en.heth.ink/Summary2024-2/

Author: YK
Posted on: 2024-08-23
Updated on: 2024-08-23
