Critically analyze practical difficulties that arise when implementing scorecards; understand the cross-fertilization potential to other business contexts (e.g. fraud detection, CRM).

This assessment relates to the following module learning outcomes:

A. Knowledge and Understanding

A1. Understand the potential of KDD and data mining for developing scorecards.

B. Subject Specific Intellectual and Research Skills

B1. Work with software to develop credit scoring solutions; develop a scorecard using data mining techniques.

C. Transferable and Generic Skills

C1. Critically analyze practical difficulties that arise when implementing scorecards; understand the cross-fertilization potential to other business contexts (e.g. fraud detection, CRM).

Coursework Brief:

Question 1 (65 marks)

Banks play a crucial role in market economies. They decide who can get finance and on what terms and can make or break investment decisions. For markets and society to function, individuals and companies need access to credit.

Credit scoring algorithms, which estimate the probability of default, are the method banks use to determine whether or not a loan should be granted. This requires banks to improve on the state of the art in credit scoring by predicting the probability that somebody will experience financial distress in the next two years.

Historical data (cs-training.csv) are provided on 150,000 borrowers. The following variables are available to you:

Variable Name	Description	Type
SeriousDlqin2yrs	Person experienced 90 days past due delinquency or worse	Y/N
Revolving Utilization Of Unsecured Lines	Total balance on credit cards and personal lines of credit (excluding real estate and instalment debt such as car loans) divided by the sum of credit limits	percentage
age	Age of borrower in years	integer
Number Of Time 30-59 Days Past Due Not Worse	Number of times borrower has been 30-59 days past due but no worse in the last 2 years	integer
Debt Ratio	Monthly debt payments, alimony and living costs divided by monthly gross income	percentage
Monthly Income	Monthly income	real
Number Of Open Credit Lines And Loans	Number of open loans (instalment loans such as a car loan or mortgage) and lines of credit (e.g. credit cards)	integer
Number Of Times 90 Days Late	Number of times borrower has been 90 days or more past due	integer
Number Real Estate Loans Or Lines	Number of mortgage and real estate loans including home equity lines of credit	integer
Number Of Time 60-89 Days Past Due Not Worse	Number of times borrower has been 60-89 days past due but no worse in the last 2 years	integer
Number Of Dependents	Number of dependents in family excluding themselves (spouse, children etc.)	integer

The goal of Question 1 is to build a model from the training dataset that banks can use to help make the best financial decisions on borrowers in the testing dataset (cs-test.csv).

1.1 Carefully pre-process the dataset by considering the following activities:

Exploratory data analysis.
Missing value handling (if any). Marks will be deducted for simply replacing missing values with a single value; a proper study of the missing values is required.
Outlier detection and treatment (if any). Marks will be deducted for simply eliminating outliers or replacing them with a value without justification; a proper study of the outliers is required.
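As an illustration only, the kind of study asked for in 1.1 might start along the following lines in Python. This is a minimal sketch, not a required method: it assumes pandas and NumPy are available, the small synthetic frame is a stand-in for cs-training.csv (the values are invented), and the helper names missing_summary, missing_rate_by_group and iqr_outlier_mask are hypothetical. The 1.5 x IQR rule is one common convention for flagging outliers, not the only defensible one.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for cs-training.csv (invented values, NOT the real data).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "age": rng.integers(21, 90, size=n),
    "DebtRatio": rng.lognormal(mean=-1.0, sigma=1.0, size=n),
    "MonthlyIncome": rng.lognormal(mean=8.5, sigma=0.6, size=n),
})
# Make income missing more often for younger borrowers,
# to mimic missingness that depends on another variable.
is_young = (df["age"] < 30).to_numpy()
drop = rng.random(n) < np.where(is_young, 0.4, 0.1)
df.loc[drop, "MonthlyIncome"] = np.nan


def missing_summary(frame):
    """Share of missing values per column, largest first."""
    return frame.isna().mean().sort_values(ascending=False)


def missing_rate_by_group(frame, column, by, bins):
    """Missing rate of `column` within bins of `by`.

    A roughly flat profile is consistent with missingness unrelated to `by`;
    a clear trend suggests the values are not missing completely at random.
    """
    groups = pd.cut(frame[by], bins=bins)
    return frame[column].isna().groupby(groups, observed=True).mean()


def iqr_outlier_mask(series, k=1.5):
    """True where a value lies outside [Q1 - k*IQR, Q3 + k*IQR]; NaNs are not flagged."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


summary = missing_summary(df)
by_age = missing_rate_by_group(df, "MonthlyIncome", "age", bins=[20, 30, 50, 70, 90])
outliers = iqr_outlier_mask(df["DebtRatio"])

print(summary)
print(by_age)
print(f"DebtRatio values flagged by the IQR rule: {outliers.sum()} of {n}")
```

A missing rate that differs clearly across age bands, as it does in this synthetic example, is the sort of evidence that would argue against filling every gap with one constant. Flagged outliers still need a justification before they are treated, for example by checking whether they are data errors or genuine extreme borrowers.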

1.2 Build a credit scoring model in which SeriousDlqin2yrs is used as a target (default) and report the following:

What method do you use?
Why do you use this method?
Discuss your results.
The most important variables
The impact of the variables on the target
The performance of the model. Use various performance metrics and discuss their relationship if any.
What do banks win and lose by doing this?
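On the point about performance metrics and their relationship, the sketch below shows three measures commonly reported for scorecards: AUC, the Gini coefficient and the Kolmogorov-Smirnov (KS) statistic. It is an illustration under stated assumptions, not a prescribed approach: the labels and scores are invented, the function names are hypothetical, and the pairwise AUC calculation is written for clarity rather than speed, so a library routine would be preferable on the full dataset.

```python
def split_scores(labels, scores):
    """Separate scores of defaulters (label 1) from non-defaulters (label 0)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    return pos, neg


def auc(labels, scores):
    """Probability that a random defaulter is scored above a random
    non-defaulter, with ties counted as one half."""
    pos, neg = split_scores(labels, scores)
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient, a linear rescaling of AUC: Gini = 2 * AUC - 1."""
    return 2 * auc(labels, scores) - 1


def ks_statistic(labels, scores):
    """Largest gap between the cumulative score distributions
    of defaulters and non-defaulters."""
    pos, neg = split_scores(labels, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Invented example: 1 = default, scores are predicted default probabilities.
labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.05, 0.10, 0.80, 0.20, 0.35, 0.30, 0.15, 0.60, 0.40, 0.02]

model_auc = auc(labels, scores)
model_gini = gini(labels, scores)
model_ks = ks_statistic(labels, scores)

print(f"AUC  = {model_auc:.3f}")
print(f"Gini = {model_gini:.3f}")
print(f"KS   = {model_ks:.3f}")
```

Because Gini is a fixed transformation of AUC, reporting both adds no new information about ranking quality. KS, by contrast, measures separation at a single best cut-off, so it can move differently from AUC and is worth discussing separately.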

In terms of software, use SAS Enterprise Miner or anything else (e.g., Python, R and so on). Carefully report the various steps of your methodology and discuss your results in a rigorous way!

Question 2 (35 marks)

Find an academic or business paper published in 2019 or later discussing a real-life application of data mining or credit scoring. It is important that the case considered is a real-life case and not an artificial one. Some suggested journals are:

Management Science
Operations Research
INFORMS Journal on Computing
INFORMS Journal on Applied Analytics
Journal of Machine Learning Research
European Journal of Operational Research
ICDM (The IEEE International conference on data mining)
NeurIPS (Conference on Neural Information Processing Systems)
KDD (ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Once you have found an appropriate paper, report the following in separate sections:

Title, authors and complete citation (journal name, book title, issue, year, …)
The data mining problem considered
The data mining techniques used
The results reported
A critical discussion of the model and results (assumptions made, shortcomings, limitations, …)

Make sure you demonstrate that you understand what the article is all about!

Word Limit: +/-10% either side of the word count (see above) is deemed to be acceptable. Any text that exceeds an additional 10% will not attract any marks. The relevant word count includes items such as cover page, executive summary, title page, table of contents, tables, figures, in-text citations and section headings, if used. The relevant word count excludes your list of references and any appendices at the end of your coursework submission.

You should always include the word count (from Microsoft Word, not Turnitin), at the end of your coursework submission, before your list of references.

Title/Cover Page: You must include a title/cover page that includes your Student ID, Module Code, Assignment Title and Word Count. This assignment will be marked anonymously; please ensure that your name does not appear on any part of your assignment.

References: You should use the Harvard style to reference your assignment. The library provides guidance on how to reference in the Harvard style.
