Recent advances in machine learning and natural language processing (NLP) enable machines to better understand human languages. Email systems now routinely warn us when an email we receive may be spam, suggest a quick response to an email, correct our grammatical errors, and suggest words or an entire sentence for an email we are writing. The main advantage machines have over humans is that they can quickly process huge amounts of data. In a new paper, we examine the extent to which machines can help us understand the functioning of financial markets.
Public companies provide detailed information about their performance and outlook to shareholders annually. They file these reports as Form 10-K with the SEC, which makes them publicly available on its EDGAR website. 10-Ks include many tables showing financial reports and other quantitative information. They also include discussions of the performance and outlook for the firm and its industries, risk factors, competition, fundraising, etc. Are these qualitative discussions informative beyond the quantitative information in the filings? Between the start of EDGAR in 1994 and August 2019, U.S. public companies filed about 200,000 10-Ks, each running about 100 pages or more. It is impossible to manually read all these filings to answer this question.
Currently, the most popular approach to analyzing the sentiment in a large volume of text is a word-based approach, which counts the number of positive, neutral, and negative words out of a subset of words (called a dictionary) used in a text dataset. While this approach has been reasonably successful in capturing negative sentiment, it is unable to reliably identify positive sentiment, as Loughran and McDonald (2016) discuss. An example from a 10-K illustrates the difficulty: “For these and other reasons, these competitors may achieve greater acceptance in the marketplace than our company, limiting our ability to gain market share and customer loyalty and increase our revenues.” While several words in this sentence are positive when read in isolation, the tone of the sentence is clearly negative. Moreover, while this method has the advantage of simplicity, its overall accuracy is only about 45 percent. Another approach, called Naïve Bayes classification (NBC), measures sentiment based on sentences. While achieving higher accuracy (about 67 percent to 77 percent) than word-based measures, its ability to detect positive sentiment is also limited because it does not take into account the relationship between words and the sequential nature of the text in a sentence. After all, words matter only in context.
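To make the limitation concrete, here is a minimal Python sketch of a word-counting measure applied to the 10-K sentence quoted above. The two word lists are tiny, hypothetical stand-ins chosen for illustration; they are not the dictionary used in the literature, and the function name is our own.

```python
import re

# Toy word lists for illustration only (assumed, not a published dictionary).
POSITIVE_WORDS = {"achieve", "greater", "gain", "loyalty"}
NEGATIVE_WORDS = {"limiting", "loss", "decline"}

def word_sentiment_counts(text):
    """Count dictionary hits while ignoring context and word order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return {"positive": positive, "negative": negative}

sentence = (
    "For these and other reasons, these competitors may achieve greater "
    "acceptance in the marketplace than our company, limiting our ability "
    "to gain market share and customer loyalty and increase our revenues."
)
counts = word_sentiment_counts(sentence)
print(counts)
```

Under these toy lists, positive hits outnumber negative ones, so a pure word count would lean positive on a sentence that a human reader would label negative.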
This is where machine learning comes in. We use deep learning, which is a machine learning technique, and measure the sentiment of the text in 10-Ks as the percentage of the sentences with a positive, neutral, or negative tone. This method increases the accuracy of sentiment measurement to 90 percent. Here is the idea behind our method. We first use all 220 million sentences from all 10-Ks to train a model to understand the relationship between words. This revolutionary NLP method is known as Word2Vec and was introduced by Mikolov et al. (2013a, 2013b). Next, we manually label 8,000 sentences and train a deep neural network to classify the sentiment in a sentence as positive, negative, or neutral. This approach is akin to teaching a robot how to walk by walking it for a while. Our measure of negative (positive) sentiment is the number of negative (positive) sentences divided by the total number of sentences in a 10-K filing. Using this more accurate sentiment measure, we next examine whether sentiment has information content.
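The filing-level measure described above can be sketched in a few lines of Python. This covers only the final aggregation step and assumes that each sentence has already been labeled by a classifier; the function name and the ten example labels are hypothetical.

```python
from collections import Counter

TONES = ("positive", "neutral", "negative")

def sentiment_shares(sentence_labels):
    """Return the share of sentences carrying each tone in one filing."""
    if not sentence_labels:
        raise ValueError("a filing needs at least one labeled sentence")
    counts = Counter(sentence_labels)
    total = len(sentence_labels)
    return {tone: counts[tone] / total for tone in TONES}

# Hypothetical classifier output for a ten-sentence filing.
labels = ["neutral"] * 6 + ["negative"] * 3 + ["positive"]
shares = sentiment_shares(labels)
print(shares)
```

Because positive and negative shares are computed separately, the two measures can enter a regression as distinct variables rather than as a single net score.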
We start by examining the relation between our sentiment measures and the reaction of stock prices and trading volumes to the 10-K filing. Our results show that positive (negative) sentiment predicts higher (lower) abnormal return over days (0, +3) around the 10-K filing date, i.e., the filing period. Both sentiment measures also predict higher abnormal return over event windows of up to one month after the filing period. This finding suggests that the market underreacts to positive sentiment and overreacts to negative sentiment in the 10-K filing during the filing period. Moreover, both sentiment measures are significantly related to abnormal trading volume around the filing date. Negative sentiment reflects more concerns and uncertainty about the future, which results in greater divergence of opinions among investors and therefore leads to higher trading volume, and vice versa. We find that investors are more responsive to negative sentiment than to positive sentiment.
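For readers unfamiliar with event windows, the following Python sketch shows one simple way to compute a cumulative abnormal return over days (0, +3). It assumes a market-adjusted definition (stock return minus market return), which is only one common choice; the paper may use a different benchmark. The function name and the daily returns are made up for illustration.

```python
def cumulative_abnormal_return(stock_returns, market_returns, start=0, end=3):
    """Sum daily (stock - market) returns over event days start..end inclusive.

    Both arguments map an event day (0 = filing date) to a daily return.
    """
    return sum(
        stock_returns[day] - market_returns[day]
        for day in range(start, end + 1)
    )

# Made-up daily returns around a hypothetical filing date.
stock = {0: 0.012, 1: 0.004, 2: -0.003, 3: 0.006}
market = {0: 0.002, 1: 0.001, 2: -0.001, 3: 0.003}
car = cumulative_abnormal_return(stock, market)
print(round(car, 4))
```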
We next examine the relation between sentiment and future firm fundamentals. We find that positive sentiment predicts higher return on assets (ROA) and higher operating cash flow (OCF) over the next year, while negative sentiment predicts lower ROA and OCF. The economic significance of negative sentiment is comparable to that of positive sentiment, suggesting that positive sentiment is nearly as informative regarding future profitability as negative sentiment.
We next evaluate the informativeness of the sentiment in the 10-K filing regarding future firm policies. An unfavorable business environment or greater uncertainty about a firm’s future prospects should be reflected in higher negative sentiment in the filing. Managers of such firms are likely to increase cash holdings in the future to be prepared for potential losses and unexpected costs. Accordingly, our results show that negative sentiment predicts higher future cash holdings. On the other hand, positive sentiment can reflect a strong operational and financial situation that requires lower future cash holdings. It can also reflect higher growth opportunities, followed by greater spending on new projects and expansion that results in lower future cash holdings. Accordingly, we find that positive sentiment predicts lower future cash holdings. Interestingly, the estimated coefficient of negative sentiment is approximately three times larger than that of positive sentiment in absolute value, a result that suggests that managers are more responsive when performance and outlook are weak than when they are strong.
Our finding that positive sentiment is related to higher cash flow from operations triggers a natural question: What is the extra cash used for? To investigate this issue, we examine the relationship between sentiment and future use of leverage. We find that an increase in positive sentiment predicts an economically significant decrease in leverage over the next period, suggesting that the extra cash generated in the future is used to reduce leverage. On the other hand, negative sentiment predicts higher leverage, but the magnitude of this relation is much smaller than that of positive sentiment. This asymmetric relation suggests that poorly performing firms have a harder time raising additional debt.
Finally, motivated by Cohen, Malloy, and Nguyen (2018), we examine whether changes in sentiment are informative. We repeat our analyses using changes, instead of levels, of sentiment as independent variables. We find that an increase in positive sentiment predicts higher abnormal return at the 10-K filing date. While the coefficient of the change in negative sentiment is negative, it is statistically insignificant. Moreover, changes in sentiment predict future profitability, cash holdings, and leverage. The results for changes in positive sentiment are much stronger than for changes in negative sentiment, both statistically and economically.
Overall, we find persuasive evidence that, in contrast to most prior studies, positive sentiment in 10-K filings is informative and that stock prices and trading volumes react to it. Positive sentiment also predicts future firm fundamentals and policies. The effects of positive sentiment and negative sentiment in corporate filings are often asymmetric, which implies that using a net sentiment measure advocated by prior studies would result in loss of information. More importantly, our findings suggest that employing this state-of-the-art technique for textual analysis can provide more reliable measures of sentiment. Word2Vec and the deep learning classifier can be shared and used easily, and researchers can improve the accuracy of the classifier by using their own labelled sentences, which would substantially reduce the cost of using this approach. Finally, in addition to measuring general sentiment in other sources of textual data in finance, this method can be used for tasks such as topic-specific content analysis, e.g., classifying text into topics such as competition, innovation, or financial constraints and measuring the tone within each topic.
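The point about net sentiment can be illustrated with a short Python sketch. It assumes that net sentiment is defined as the positive share minus the negative share, which is a simplification of the measures used in prior studies; the two filings and their shares are hypothetical.

```python
def net_sentiment(positive_share, negative_share):
    """Collapse the two shares into one number (assumed definition)."""
    return positive_share - negative_share

# Two hypothetical filings with very different tone profiles.
filing_a = {"positive": 0.10, "negative": 0.05}
filing_b = {"positive": 0.30, "negative": 0.25}

net_a = net_sentiment(filing_a["positive"], filing_a["negative"])
net_b = net_sentiment(filing_b["positive"], filing_b["negative"])
print(round(net_a, 2), round(net_b, 2))
```

Both filings receive essentially the same net score even though the second contains far more positive and far more negative sentences. If the two sentiments have effects of different magnitudes, a single net number cannot capture that difference.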
Cohen, Lauren, Christopher Malloy, and Quoc Nguyen. Lazy prices. Working paper no. w25084. National Bureau of Economic Research, 2018.
Loughran, Tim, and Bill McDonald. Textual analysis in accounting and finance: A survey. Journal of Accounting Research 54, no. 4 (2016): 1187-1230.
Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013a).
Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, (2013b): 3111-3119.
This post comes to us from Mehran Azimi, a PhD candidate at the University of Alabama’s Culverhouse College of Business, and Anup Agrawal, a professor there. It is based on their recent article, “Is Positive Sentiment in Corporate Annual Reports Informative? Evidence from Deep Learning,” available here.