Financial credit risk is the risk of a financial loss that arises from a counterparty’s inability to meet the obligations agreed within a financial contract. Credit risk analysis is a process that assesses the obligor and quantifies, well in advance, its capacity to repay its borrowing. Analysts adopt one or both of the following methods for credit risk modeling:
- Data mining and/or statistical learning approach
- Natural computing and mathematical modeling
The key metrics in credit risk modeling are credit rating (probability of default), exposure at default, and loss given default.
Typically, credit rating or probability of default calculations are classification or regression problems: the model either classifies a customer as “risky” or “non-risky,” or predicts the probability of default from past data. Although both traditional statistical analysis and mathematical models are widely used in credit risk analysis, neural network models are more flexible and better able to model complex non-linear functions than classical statistical models such as linear discriminant analysis and logistic regression. For example, a logistic regression model is easy to interpret because it is an additive linear combination of inputs and weights, and its weights are fitted by a learning algorithm, but it gives low accuracy on complex non-linear relationships. A neural network that uses a logistic activation function, by contrast, can learn complex non-linear relationships as its number of hidden layers (H) increases. In fact, a neural network with H = 0 is equivalent to logistic regression.
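To make the H = 0 equivalence concrete, here is a minimal sketch in plain Python. The function names (`logistic_regression`, `feed_forward`), the feature values, and the weights are all illustrative assumptions, not taken from any particular library or dataset.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression(x, w, b):
    """Classical logistic regression: sigmoid of a weighted sum of the inputs."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def feed_forward(x, layers):
    """Forward pass through a list of (weights, biases) layers.

    Each layer holds one weight row per neuron, and every neuron applies the
    logistic function. The last layer is the output layer; any layers before
    it are hidden layers.
    """
    activations = list(x)
    for weights, biases in layers:
        activations = [
            sigmoid(sum(wi * ai for wi, ai in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

# Hypothetical applicant features (already scaled to [0, 1]) and weights.
x = [0.2, 0.7, 0.5]
w, b = [1.5, -2.0, 0.8], 0.1

# H = 0: the network is just the output neuron, i.e. logistic regression.
p_lr = logistic_regression(x, w, b)
p_nn = feed_forward(x, [([w], [b])])[0]

# H = 1: one hidden layer with two neurons feeding the output neuron.
hidden = ([[0.5, -1.0, 0.3], [-0.7, 0.9, 1.1]], [0.0, 0.2])
output = ([[1.2, -0.6]], [0.05])
p_deep = feed_forward(x, [hidden, output])[0]
```

With no hidden layers, `p_nn` and `p_lr` come from the same expression, so they coincide. Adding the hidden layer is what lets the output depend on non-linear combinations of the inputs.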
A benefit of ANNs is that they do not require the functional relationship between dependent and independent variables to be explicitly specified. Since they are connectionist learning machines, the knowledge is directly embedded in a set of weights on the arcs linking the processing nodes. Additionally, this reduces the need to drop potentially autocorrelated or otherwise less reliable or important columns of data during the cleaning process prior to training the model. The ANN's weighting process simply assigns a lower weight to variables it finds to be less important.
What is an Artificial Neural Network?
An artificial neural network (ANN) is a network of highly interconnected processing elements (neurons) operating in parallel. These elements are inspired by the biological nervous system, and the connections between elements largely determine the network function. A typical back propagation neural network has a three-layer structure: input nodes, hidden nodes, and output nodes.
In our examples below, we’ll use financial variables as the input nodes and rating outcomes as the output nodes. The input layer is used for input training data, the hidden layers transform raw data into high-dimensional non-linear features, and the output layer classifies the data.
- Input: The input layer is composed of neurons that take credit risk measurement indicators as the input vector. Scores for the qualitative indicators can be obtained with the help of expert knowledge. For the computational convenience of the ANN model, each score should be divided by the highest score value so that it falls in the range [0, 1].
- Hidden: Low-level features from the raw input data are abstracted into high-level features through multiple hidden layers.
- Output: There is only one neuron in the output layer, representing the credit risk level. Its value lies in the range [0, 1]; the higher the value, the higher the indicated risk level. The credit risk levels are Very High, High, Average, Low, and Very Low.
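The input scaling and the output labeling described in the list above can be sketched as follows. The equal-width cut-offs in `RISK_BANDS` are an assumption made purely for illustration (the article does not specify where one level ends and the next begins), and the function names are hypothetical.

```python
def normalize_scores(scores, max_score):
    """Scale expert scores into [0, 1] by dividing by the highest score value."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return [s / max_score for s in scores]

# Assumed equal-width bands; a real model would calibrate these cut-offs.
RISK_BANDS = [
    (0.2, "Very Low"),
    (0.4, "Low"),
    (0.6, "Average"),
    (0.8, "High"),
    (1.0, "Very High"),
]

def risk_level(output):
    """Map the single output neuron's value in [0, 1] to a risk label."""
    if not 0.0 <= output <= 1.0:
        raise ValueError("output must lie in [0, 1]")
    for upper, label in RISK_BANDS:
        if output <= upper:
            return label
    return RISK_BANDS[-1][1]

# Hypothetical expert scores on a 0-10 scale, and a hypothetical network output.
inputs = normalize_scores([3, 8, 10, 5], max_score=10)
level = risk_level(0.73)
```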
Now, let’s explore the practical value of ANNs by applying them to some well-known risk-rating scenarios in the financial industry below.
Credit Risk of Commercial Banks
In this scenario, a commercial bank has incomplete historical data because its credit risk management has lagged. An ANN-based credit risk identification model can perform online learning as data accumulates over time, a task unachievable by traditional credit risk measurement models.
The credit risk identification model is constructed based on an ANN Back Propagation (BP) algorithm. The ANN-based model is first trained with this algorithm on historical data. Then, the model can be used to identify the credit risk of the debtor firms, providing decision support for credit risk control.
A back propagation network typically starts out with a random set of weights. The network adjusts its weights each time it sees an input–output pair. Each pair is processed in two stages: a forward pass and a backward pass. The forward pass involves presenting a sample input to the network and letting activations flow until they reach the output layer. The backward pass compares the network's output with the target output and propagates the resulting error back through the layers to determine the weight adjustments. Standard back propagation is a gradient descent algorithm, which means the network weights are moved along the negative of the gradient of the error function. One iteration step of the algorithm can be written as:
W(t+1) = W(t) + μ(−∇E(t))
where W(t) is a vector of the weights at iteration step t, ∇E(t) is the current gradient of the error function E that is usually the sum of the squared errors, and μ is the learning rate.
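As a minimal sketch of this update rule, the snippet below applies it to a single linear neuron trained on a sum-of-squared-errors objective. The three training samples, the learning rate of 0.1, and the function names are assumptions chosen for illustration; a full back propagation network would compute the gradient layer by layer instead of with the closed-form expression used here.

```python
def gradient_step(weights, gradient, learning_rate):
    """One update W(t+1) = W(t) + mu * (-grad E(t))."""
    return [w - learning_rate * g for w, g in zip(weights, gradient)]

def predict(weights, x):
    """Output of a single linear neuron."""
    return sum(w * xi for w, xi in zip(weights, x))

def sse(weights, samples):
    """Error function E: sum of squared errors over all samples."""
    return sum((y - predict(weights, x)) ** 2 for x, y in samples)

def sse_gradient(weights, samples):
    """Gradient of E with respect to each weight."""
    grad = [0.0] * len(weights)
    for x, y in samples:
        residual = y - predict(weights, x)
        for j, xj in enumerate(x):
            grad[j] += -2.0 * residual * xj
    return grad

# Made-up (input, target) pairs and starting weights.
samples = [([1.0, 0.0], 0.5), ([0.0, 1.0], -0.3), ([1.0, 1.0], 0.2)]
weights = [0.0, 0.0]
mu = 0.1  # learning rate

errors = [sse(weights, samples)]
for _ in range(50):
    weights = gradient_step(weights, sse_gradient(weights, samples), mu)
    errors.append(sse(weights, samples))
```

On this toy problem the recorded `errors` shrink from one iteration to the next, which is the behavior the update rule is designed to produce when μ is small enough.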
To reduce training time, the learning algorithm updates the weights according to gradient descent with an additional momentum, β. [Note: This is one of the BP algorithms with adaptive learning rate and momentum.]
W(t+1) = W(t) + μ(−∇E(t)) + βΔW(t−1)
ΔW(t) = μ(−∇E(t)) + βΔW(t−1)
where ΔW(t) is the current adjustment of the weights, β is the momentum, and ΔW(t−1) is the previous change to the weights.
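The momentum update can be sketched in a few lines. The toy error surface E(w) = 0.5 · (w0² + 25 · w1²), the starting weights, and the values μ = 0.03 and β = 0.8 are assumptions picked to show the effect; they are held fixed here, whereas the algorithm described in this article also adapts μ and β during training.

```python
def momentum_step(weights, gradient, prev_delta, learning_rate, momentum):
    """Compute dW(t) = mu * (-grad E(t)) + beta * dW(t-1), then W(t+1) = W(t) + dW(t)."""
    delta = [-learning_rate * g + momentum * d
             for g, d in zip(gradient, prev_delta)]
    new_weights = [w + d for w, d in zip(weights, delta)]
    return new_weights, delta

def gradient(weights):
    """Gradient of the toy error E(w) = 0.5 * (w0**2 + 25 * w1**2)."""
    return [weights[0], 25.0 * weights[1]]

def error(weights):
    """Toy error surface: shallow along w0, steep along w1."""
    return 0.5 * (weights[0] ** 2 + 25.0 * weights[1] ** 2)

def train(learning_rate, momentum, steps=100):
    """Run a fixed number of updates and return the final error."""
    weights, delta = [1.0, 1.0], [0.0, 0.0]
    for _ in range(steps):
        weights, delta = momentum_step(
            weights, gradient(weights), delta, learning_rate, momentum)
    return error(weights)

plain_error = train(learning_rate=0.03, momentum=0.0)     # plain gradient descent
momentum_error = train(learning_rate=0.03, momentum=0.8)  # same rate plus momentum
```

Setting `momentum=0.0` recovers the plain gradient descent step, so the two runs differ only in the βΔW(t−1) term. On this elongated surface the run with momentum ends at a lower error after the same number of steps.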
Momentum allows a network to respond not only to the local gradient, but also to recent trends in the error surface. Acting like a low-pass filter, momentum allows the network to ignore small features in the error surface. The learning rate and momentum are updated with optimized values during the training process through a conjugate gradient method that exploits the first-order and second-order derivatives with respect to μ and β. These derivatives can be calculated efficiently and conveniently during each back propagation iteration step. After training, the model can be used to analyze the data of a loan applicant firm for credit risk identification.
Credit Risk of Individual Customers
Most applicants in credit risk assessments are non-defaulting, and only a small number are defaulters. The relatively small number of bad customers in credit datasets means that the data classes are extremely imbalanced. Standard learning algorithms work better for majority (negative) class samples and usually perform poorly on minority (positive) class samples. This imbalance degrades performance, which makes building a predictive model challenging. However, an ANN leveraging clustering and merging can achieve balanced data that can accurately judge whether a customer should be granted a loan or not.
Given this limitation of standard learning algorithms, the model can be formulated as a two-step process. First, use the k-means algorithm on the training dataset to cluster the majority class samples into k disjoint subgroups, so that no sample belongs to more than one subgroup. Second, merge each of the k majority class subgroups with the minority class data to form k balanced subgroups, creating a diverse set of training sets.
Input: Raw dataset D, number of cluster centers k
Output: Integration of deep neural network learning algorithms L
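The cluster-and-merge step can be sketched as follows, using a small hand-written k-means so the example has no external dependencies. The synthetic two-feature data, the label convention (0 for the majority class, 1 for the minority class), and the function names are assumptions for illustration. The sketch stops once the k subsets are built: training one network per subset and integrating their predictions into L is not shown.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_means(points, k, iterations=20, seed=0):
    """Plain k-means; returns k disjoint lists of points (some may be empty)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda c: squared_distance(p, centers[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return clusters

def cluster_and_merge(majority, minority, k, seed=0):
    """Step 1: split the majority class into k subgroups with k-means.
    Step 2: merge each subgroup with all minority samples."""
    subsets = []
    for group in k_means(majority, k, seed=seed):
        samples = [(x, 0) for x in group] + [(x, 1) for x in minority]
        subsets.append(samples)
    return subsets

# Synthetic data: 12 non-defaulters around three centers, 4 defaulters.
rng = random.Random(42)
majority = [[rng.gauss(cx, 0.1), rng.gauss(cy, 0.1)]
            for cx, cy in [(0, 0), (1, 1), (2, 0)] for _ in range(4)]
minority = [[rng.gauss(1, 0.1), rng.gauss(-1, 0.1)] for _ in range(4)]

subsets = cluster_and_merge(majority, minority, k=3)
```

Each subset reuses every minority sample but only a slice of the majority class, so choosing k close to the ratio of majority to minority samples keeps the subsets roughly balanced.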
Credit Rating Analysis
Rating a company's credit is typically a very costly affair. It usually requires agencies like Standard & Poor’s or Moody’s to invest a large amount of time, effort, and human resources to perform a thorough analysis based on internal financial indicators as well as strategic and operational metrics. Not every company can afford such a huge investment. The overall objective of credit rating prediction is to build models that can extrapolate from past observations, providing an evaluation of credit risk at a much lower cost. Beyond the predicted rating itself, the bond-rating modeling process also provides valuable information to users.
A bond is a debt security that constitutes a promise by the issuing firm to pay a given interest rate on the original issue price and to redeem the bond at face value at maturity. If a firm is not able to make the promised interest payments, it will be in default. When modeling credit risk, it's important to remember that some of the factors that determine the likelihood of default are subjective. These could include the firm's willingness to pay or ability to perform in a difficult situation. Other factors that might be important are the general economic situation, changes in management, currency fluctuations, etc.
Obviously, the data traditionally used for a credit risk approximation could not possibly cover all the information that would typically be included in a process like S&P's, so the model must capture the hidden relationships in the input data that is actually available. ANNs capture these patterns more effectively than a conventional discriminant analysis. Using a model interpretation method allows us to capture this information from the neural network to better understand drivers of credit risk.
Model interpretation also allows us to further optimize the back propagation algorithm by selecting optimal sets of input financial variables, following a procedure similar to that of a step-wise regression. During this process, we monitor the model's prediction accuracy and stop once a step brings no additional gain.
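One way to picture this procedure is a greedy forward selection loop, sketched below. This is an assumed variant, not necessarily the exact procedure used in practice: `evaluate` is a hypothetical callback that would retrain the network on the given variables and return its validation accuracy, and the variable names and gains in `TOY_GAINS` are made up so the example runs on its own.

```python
def forward_selection(candidates, evaluate, min_gain=1e-3):
    """Greedy forward selection of input variables.

    Repeatedly adds the variable that improves accuracy the most and stops
    when the best remaining variable gains less than `min_gain`.
    """
    selected = []
    best_score = evaluate(selected)
    remaining = list(candidates)
    while remaining:
        scored = [(evaluate(selected + [v]), v) for v in remaining]
        score, variable = max(scored)
        if score - best_score < min_gain:
            break  # no variable adds a worthwhile gain
        selected.append(variable)
        remaining.remove(variable)
        best_score = score
    return selected, best_score

# Toy stand-in for "train the network and measure accuracy": each variable
# contributes a fixed, made-up gain. Real gains would come from retraining.
TOY_GAINS = {
    "leverage": 0.15,
    "interest_coverage": 0.10,
    "firm_size": 0.05,
    "noise": 0.0,
}

def toy_accuracy(variables):
    """Hypothetical accuracy: a 0.5 baseline plus each variable's gain."""
    return 0.5 + sum(TOY_GAINS[v] for v in variables)

chosen, accuracy = forward_selection(list(TOY_GAINS), toy_accuracy)
```

In this toy run the loop keeps the three informative variables and stops before adding `noise`, because that variable brings no accuracy gain.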
Neural networks are invaluable tools for predicting credit risk in situations where classical statistical or other machine learning methods fall short. It's important to emphasize, however, that these credit ratings are not meant to substitute for an expert's analysis of a company's level of financial risk; rather, they should serve as an empirical complement to the process.