Dr. Stanley Njoku
PH.D IN BUSINESS ANALYTICS & DATA SCIENCE

The Solutions to Most Business Needs Could Be Embedded in Their Internal Datasets

I am deeply passionate about uncovering solutions within large data sets and collaborating with stakeholders to enhance business outcomes!

Dissertation:

Using Machine Learning To Improve Clinical Trials for New Drug Development

New drug development has remained complex for decades. Bringing a single new drug to approval by the Food and Drug Administration (FDA) takes 10 to 15 years, costs an average of $2.8 billion, and carries a failure rate of over 90% (Ekins et al., 2019).

For any drug to be approved by the FDA, it must undergo rigorous clinical trial phases involving human subjects, from Phase I through Phase III. Phase II accounts for the largest number of failures (Vijayan, Kihlberg, Cross, & Poongavanam, 2022).

Many clinical trials have been withdrawn, suspended, terminated, or delayed due to insufficient recruitment of human subjects to show efficacy and toxicity. To successfully demonstrate efficacy and toxicity and gain FDA approval, sufficient and convincing safety data must be available. However, recruiting sufficient human subjects for clinical trials poses severe challenges for the pharmaceutical industry.

Approximately 80% of clinical trials fail to meet their recruitment timelines, with delays costing between $600,000 and $8 million per day, according to experts.
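To make the scale of these delay costs concrete, the short Python sketch below multiplies the per-day cost range quoted above by a delay length. The 30-day delay and the function name `delay_cost_range` are illustrative assumptions, not figures or methods from the study.

```python
# Per-day cost range of a recruitment delay, as quoted in the text above.
LOW_COST_PER_DAY = 600_000      # USD
HIGH_COST_PER_DAY = 8_000_000   # USD


def delay_cost_range(delay_days: int) -> tuple[int, int]:
    """Return the (low, high) estimated cost in USD of a delay of `delay_days` days."""
    if delay_days < 0:
        raise ValueError("delay_days must be non-negative")
    return delay_days * LOW_COST_PER_DAY, delay_days * HIGH_COST_PER_DAY


# Hypothetical example: a 30-day recruitment delay (assumed, not from the study).
low, high = delay_cost_range(30)
print(f"30-day delay: ${low:,} to ${high:,}")
```

Under these assumptions, even a one-month slip in recruitment falls in the range of tens to hundreds of millions of dollars, which is why shortening the recruitment timeline is the focus of the research question.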

Can Machine Learning Techniques Improve Clinical Trials by Reducing Recruitment Timelines?

Prolonged recruitment and failure to meet the planned enrollment of human subjects have resulted in most clinical trials being terminated, withdrawn, or suspended. Other factors, such as lack of funding, may also revolve around retaining and attracting human subjects.

The number of human subjects needed to demonstrate statistical significance at a predefined level of efficacy is crucial (Fogel, 2018). More than half of clinical trial expenses are associated with delays due to prolonged recruitment (Cai et al., 2021).

Each clinical protocol must provide an estimated duration of the study and the maximum number of human subjects needed to measure effectiveness and safety and to demonstrate a statistically significant level of efficacy (Fogel, 2018).

  • Can machine learning help reduce recruitment timelines?
  • Why are many clinical trials failing to meet their recruitment targets?
  • What is the impact of selecting the wrong clinical trial sites?
  • Are clinical sites targeting the right patients?

This Quantitative Research Study Extracted Numerous Datasets from Various Databases for Exploratory and Logical Insights

The qualitative approach analyzed the data for insights, while the quantitative method applied various machine learning techniques.

PHD – Research Data Summary Table

– Total Records (unique): ~426.6 Million
– Total Words: ~9.44 Million
Data Source | Dataset Description | Volume | Purpose / Insight
ClinicalTrials.gov | Total Clinical Trial Studies (2012–2022) | 165,630 records | Benchmark dataset for overall trial performance analysis
ClinicalTrials.gov | Failed Interventional Trials | 23,167 records | Understand causes of failure (terminated, withdrawn, suspended)
ClinicalTrials.gov | Recruiting Clinical Trials (as of May 14, 2023) | 64,685 records | Identify trials currently seeking participants
- Interventional Trials | Focused recruitment dataset | 47,128 records | Used for phase-level recruitment analysis and sponsor performance
- Observational Trials | Excluded from deep analysis | 17,557 records | Not used for main recruitment insights
PhysioNet (Beth Israel ED) | Emergency Department Admissions | ~425,000 patients | Used to assess feasibility of patient recruitment via real-world hospital data
PhysioNet (All Files) | Combined detailed ED records (e.g., diagnosis, vitals) | 361 million records | Benchmarking, hypothesis testing, and diversity/inclusion evaluation
Social Media (TNBC Foundation) | Triple Negative Breast Cancer Foundation Text Data | 8,625,000 words | Analyzed for patient concerns and community voice
Social Media (ACS) | American Cancer Society Text Data | 815,374 words | Supplementary social insights
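
As a quick consistency check on the table above, the Python sketch below confirms that the interventional and observational subsets add up to the recruiting-trials total, and that the two social media sources account for the ~9.44 million words in the summary. The variable names are illustrative assumptions; the figures are copied from the table.

```python
# Volumes copied from the research data summary table above.
recruiting_total = 64_685   # Recruiting Clinical Trials (ClinicalTrials.gov)
interventional = 47_128     # Interventional subset
observational = 17_557      # Observational subset

tnbc_words = 8_625_000      # TNBC Foundation text data
acs_words = 815_374         # American Cancer Society text data

# The two recruiting subsets should sum to the recruiting total.
subset_sum = interventional + observational
# The two text sources should sum to the reported word total.
total_words = tnbc_words + acs_words

print(f"Interventional + observational: {subset_sum:,}")
print(f"Interventional share of recruiting trials: {interventional / recruiting_total:.1%}")
print(f"Total words: {total_words:,} (~{total_words / 1e6:.2f} million)")
```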

A High-Quality Research Paper Could Unfold New Innovative Thinking

A research study is a systematic empirical investigation of a specific topic or issue.  The purpose of a research study is to develop or contribute to generalizable knowledge on a specific topic.

Dissertation Overview and Chapters

The aim of this research is to explore ways machine learning techniques could shorten the recruitment timeline of clinical trials in new drug development. A mixed-methods approach will be used, combining inductive and deductive thinking and offsetting the limitations of using either method exclusively.

CHAPTER I: INTRODUCTION
  • Background of Study
  • Problem Statement
  • Etc.
CHAPTER II: LITERATURE REVIEW
  • Title Searches
  • Articles
  • Etc.
CHAPTER III: METHOD
  • Research Method & Design…
  • Population, Sampling, Data…
CHAPTER IV: RESULTS
  • Pilot Study
  • Findings
CHAPTER V: FINDINGS & RECOMMENDATIONS
  • Limitation, Findings…

Diversity in Clinical Trials

Underrepresentation of minority groups, pregnant women, children, and the elderly is a major challenge in clinical trials (Ramamoorthy et al., 2022).

“The misrepresentation of these groups could be hindering innovation and opportunities for new discoveries that could extend lives of the impacted patients.”

Experts Believe That Most Companies Lose 20-30% of Revenue Every Year Due to Inefficiencies

In this digital era, any business that is slow to adopt digital solutions and innovative strategies is likely to keep losing 20-30% of revenue annually due to inefficient workflows. Forward-thinking businesses, by contrast, are benefiting from adopting Agile methodology for successful product delivery. Process inefficiency will continue to be the biggest threat to companies without robust and innovative strategies. According to Harvard Business Review, 60 percent of companies experience an increase in revenue and profits after using an Agile approach.

“You can’t solve a problem on the same level that it was created. You have to rise above it to the next level”

Albert Einstein

German-Born Theoretical Physicist

“Progress is impossible without change, and those who cannot change their minds cannot change anything.”

– George Bernard Shaw

Popular Data Science questions

The most common Data Science questions for professional data scientists.

Data Science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. It combines aspects of statistics, mathematics, computer science and domain expertise to analyze data and make data-driven decisions. The goal of data science is to uncover patterns, correlations, and other insights that can inform decision making and drive business value.

The goal of data science is to extract insights and knowledge from data through the use of scientific methods, algorithms, and techniques. It aims to uncover patterns, relationships, and trends that can inform decision making, support problem-solving, and drive business value. The ultimate objective of data science is to turn data into actionable information that can drive business growth, improve decision-making, and inform strategy.

1. Strong analytical and problem-solving skills: The ability to break down complex problems and find solutions using data.

2. Technical expertise: Knowledge of programming languages (e.g. Python, R), statistics, machine learning, and databases.

3. Business acumen: Understanding of the business domain and the ability to translate data insights into business value.

4. Communication skills: The ability to effectively communicate findings to both technical and non-technical stakeholders.

5. Creativity: Creativity goes far beyond the obvious applications in communication and project design. Of course, the ability to create an attractive, easy-to-grasp report or visual out of results that would otherwise take a couple of master’s degrees to fully understand is a skill with enormous returns.

Railways: The opportunity to apply data science in the railway industry is enormous. The industry now generates huge volumes of data because many of its current systems produce log files.

Real Estate: After railways, the real estate industry offers a high number of jobs to data scientists.

Food Services: From field to fork, data science plays a very important role in the food industry.

Pharmaceutical: Data science can help pharmaceutical businesses reduce the cost of clinical trials and speed them up by identifying and analyzing various data points, such as participants’ demographic and historical data, remote patient monitoring data, and data from past clinical trial events, to mention just a few.

Technology: Companies such as Google, Amazon, and Facebook use data science to improve their products and services.

Finance: Banks, insurance companies, and other financial institutions use data scientists to analyze market trends, detect fraud, and make investment decisions.

Business acumen is a critical aspect of data science, as it helps data scientists understand the context in which they are working and the impact their findings may have on the business. A data scientist with strong business acumen has a deep understanding of the industry, the company, and its goals, which enables them to translate complex data insights into actionable recommendations that drive business value.

Additionally, having business acumen helps data scientists to communicate effectively with stakeholders, including business leaders and decision makers, and to make data-driven recommendations that are aligned with the company’s objectives. It also helps data scientists to prioritize tasks and focus on the most impactful problems to solve, ensuring that their work has the maximum impact on the business.

In short, business acumen is an essential quality for data scientists as it helps them to bridge the gap between technical expertise and business needs, and to effectively communicate their insights to stakeholders.