Benefits of DAIP
Diagnostic accuracy and efficiency
While all types of AI, including those for pathological analysis purposes, are still relatively new and require further development, the use of DAIP demonstrates promising potential.1,5–12 DAIP can speed up diagnosis and reduce turnaround times.14 One study showed that, in a diagnostic setting, the accuracy rate of AI algorithms was 99% in identifying breast cancer metastasis in lymph nodes, compared with 81% for pathologists.15 These results were confirmed by other studies and in other cancers, including colon cancer, head and neck cancer, and melanoma metastases.5 Other studies have shown that AI algorithms exhibit high sensitivity and low error rates in determining whether a lesion is benign or malignant. Furthermore, AI algorithms can incorporate human feedback to improve classification performance and reduce the number of slides needed for retraining (a “human-in-the-loop” approach).11
DAIP can also improve pathology workflow and pathologists’ efficiency.1,16,17 For example, DAIP can help quickly identify areas of interest on slides and assess markers for immunotherapies.18–20 It can also facilitate workload (re)allocation among pathologists.11 Moreover, integration of DAIP with electronic medical record systems will make information retrieval much faster than using glass slides and reduce the step of checking patient identification.1,11 Finally, a DAIP-enabled copilot may assist pathologists with differential diagnoses and in ordering the appropriate immunostains.21
In addition to image analysis, some large language models (LLMs) can help predict results based on clinical data (input). For example, a machine learning (ML) model named “survival quilts” can use the SEER database to predict cancer-specific mortality in patients with non-metastatic prostate cancers.6 LLMs can be continuously trained and learn from the data they are fed, suggesting that their predictive capabilities can improve over time as more data are provided.22 This adaptability suggests that LLMs used for diagnostic and predictive purposes can remain relevant as information and practices evolve.
Specifically, LLMs can help extract important data from cancer reports to generate synoptic reports according to cancer protocols.23–25 For researchers, LLMs can be used to extract relevant information from free-text pathology reports for annotation and data curation.23 They can also be used to produce patient-centered pathology reports, helping to better inform and educate patients.26 The segmentation of unstructured pathology reports will become possible using LLMs, which will help accurately extract critical clinical information and classify nodal statuses.24,27 Thus, these applications of LLMs can reduce workloads and improve work efficiency for clinicians and researchers.
Convenience and accessibility
DAIP is particularly attractive to clinical laboratories due to its convenience and accessibility. Because of its digital interface, collaborations between pathologists are made much more accessible. Pathologists can share cases (i.e., digital slides) remotely and collaborate with others more effectively. This advantage also extends to work schedule flexibility. For example, pathologists who wish to work remotely can have the opportunity to do so. DAIP also promises greater productivity in the workplace, as algorithms are quicker to diagnose and do not experience fatigue like humans do. Due to this reduced turnaround time, AI algorithms are useful for triaging cases. Indeed, AI algorithms can quickly diagnose cases that require a fast turnaround, such as lymph node metastasis in tumors.28
Equity and labor shortage
DAIP has the potential to bridge the gaps between smaller (often less-funded) and larger institutions, as well as between rural and urban locations. If implemented, remote areas with fewer staff or less specialized staff would be able to access the same quality of diagnostic services as less remote places with more staff or specialized staff.
An example of digital pathology (DP) benefiting rural areas is demonstrated through the Eastern Quebec Telepathology Network, an extensive implementation of DP starting in 2011, which encompasses 22 hospitals serving around 1.7 million patients.11 A 2018 study showed a reduction in two-stage surgeries and patient transfers to more urban centers. Service breaks and diagnostic delays were also reduced.11
In addition to benefiting rural and smaller institutions, DAIP may also help alleviate the overall labor shortage in the pathology workforce. Due to increased cancer incidence, an aging population, and increasingly complex cancer diagnoses, there is a growing demand for pathologists. However, the number of practicing pathologists is decreasing and is projected to shrink by 20% over the next two decades.16 AI could therefore reduce the burden on pathologists by accelerating certain tasks.16
Cost
A study demonstrated that when Memorial Sloan Kettering Cancer Center’s pathology department in the U.S. implemented DP in 2015, there was a 93% decrease in glass slide requests, and the use of whole-slide imaging (WSI) increased to around 23,000 slides per month by 2017.11,17 Their estimated savings through the use of DP were around $267,000 per year, and taking implementation costs (WSI setup and maintenance) into account, the estimated year of breaking even would be 2021 (seven years after implementation).11,17 However, our literature search did not identify any follow-up publications regarding their anticipated savings. Institutions can estimate their own potential cost savings with a customizable calculator developed by a working group of 30 experts.29
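The break-even reasoning above can be sketched as simple arithmetic. The following is a minimal illustration, not the cited study's method or the working group's calculator: the function name is hypothetical, and the upfront and running costs in the example are made-up placeholders. Only the annual savings figure (about $267,000) comes from the text.

```python
import math


def break_even_years(upfront_cost, annual_savings, annual_running_cost=0.0):
    """Return the number of whole years until cumulative net savings
    cover the upfront cost, or None if net savings are not positive.

    All inputs are in the same currency unit. This ignores discounting,
    inflation, and changing slide volumes (simplifying assumptions).
    """
    net_per_year = annual_savings - annual_running_cost
    if net_per_year <= 0:
        return None  # the investment is never recovered
    return math.ceil(upfront_cost / net_per_year)


# Annual savings as reported in the text; the other two values are
# hypothetical placeholders chosen only to show the calculation.
years = break_even_years(
    upfront_cost=1_000_000,      # hypothetical WSI setup cost
    annual_savings=267_000,      # figure reported in the text
    annual_running_cost=67_000,  # hypothetical maintenance and storage
)
print(years)
```

An institution would replace the placeholder costs with its own quotes. A result of None signals that running costs absorb all the savings.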
Pitfalls of DAIP
Legal and ethical concerns
The pitfalls of DAIP are tied to its reliance on AI technology. An AI-based diagnosis is only as good as the data it is trained on (“garbage in, garbage out”). Many of these AI algorithms are new and still in the early stages of development and application, meaning they may improve over time as they are used more frequently.15 In addition, AI performance may be biased or unfair to certain populations due to the inherent bias in the datasets used for training.30,31 This ethical concern stems from biases in the healthcare system. One study of a deep learning-based computational pathology system built with common modeling approaches found performance gaps between White and Black patients, with differences of 3.0% for breast cancer subtyping, 10.9% for lung cancer subtyping, and 16.0% for IDH1 mutation prediction in gliomas.31 The AI itself is not inherently biased, but if it is only fed biased data, its performance will reflect that bias. Therefore, addressing biases in AI will be difficult, if not impossible, as it may require a drastic change in practices related to data collection, modeling, and AI application. Nonetheless, addressing the AI fairness issue must be the starting point.
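A subgroup performance gap of the kind described above can be measured by stratifying a model's results by patient group. The sketch below is an illustration under stated assumptions: it uses plain accuracy (the cited study's exact metric is not restated here), the function names are hypothetical, and the labels and predictions are toy values, not study data.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each patient group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def largest_gap(per_group):
    """Difference between the best and worst per-group scores."""
    scores = list(per_group.values())
    return max(scores) - min(scores)


# Toy data (hypothetical): 8 cases per group, 1 = malignant, 0 = benign.
groups = ["A"] * 8 + ["B"] * 8
y_true = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = largest_gap(per_group)
print(per_group, gap)
```

Reporting the per-group scores alongside the overall score makes a disparity visible that a single pooled number would hide.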
In addition to the risk of complacency and perpetuation of disparities, another major concern of AI in pathology is that it could reduce the pathology job market and the number of pathologists needed. Given its demonstrated accuracy in previous studies, AI may play an increasingly important role in diagnostic pathology, as it does not become fatigued, demand salary increases, or need breaks. However, pathologists also contribute significantly to clinical activities beyond diagnostics. They actively participate in multidisciplinary tumor boards and can diagnose complicated and rare diseases that ML models cannot reliably be trained on. Furthermore, pathologists play a vital role in advising patients and clinicians on complicated or ethically challenging cases, which DAIP cannot do. Lastly, the interventional procedures performed by pathologists, such as fine needle aspiration, cannot be replaced by DAIP.
Because AI is still relatively new, it lacks federal or state regulation. Therefore, there are numerous ethical and legal concerns associated with its use.9,22,32 It is particularly concerning that we do not know who should be held primarily responsible when a DAIP algorithm makes a wrong diagnosis or influences a pathologist to make an incorrect decision. Similarly, who will be responsible when a DAIP algorithm makes a correct diagnosis, but the pathologist incorrectly overrides it? A more interesting question is whether and when pathologists will be able to trust AI algorithms more than themselves (and their peer pathologists) as AI improves.
Unintended and hidden costs
Theoretically, DAIP can save money by increasing productivity and containing labor costs. However, several costs are often overlooked. For example, the guidelines of the College of American Pathologists require keeping the glass slides for at least 10 years. Therefore, digital storage costs should be considered in addition to physical storage.11 Moreover, digital storage is now typically billed at a preset interval (commonly per year) and will incur costs perpetually unless the institution decides otherwise. Furthermore, the capital costs of implementing DAIP may be prohibitive or burdensome for a small practice or hospital with a limited capital budget. Should a small practice or hospital use bonds or other long-term financial instruments to afford the DAIP equipment, it may take decades to pay off the costs, and by then, the equipment may have lost maintenance support.
Linked to the potential medicolegal concerns, pathologists using DAIP may face higher malpractice insurance premiums than those who do not. The costs of cybersecurity and computing hardware should also be considered, along with the labor costs of additional quality assurance programs. Additionally, the labor costs involved in conducting cost-analysis for implementing DAIP should be taken into account. Therefore, many unintended and hidden costs are associated with implementing DAIP.
Challenges in digital implementation
While equity issues can be addressed through the implementation of DAIP, some places do not even have the funds to fully implement it. There are many steps in the implementation process, including institutional approval, cost analysis, procurement of AI tools (which requires time and money for training and research), configuration, and adoption.
In addition to the numerous implementation steps, DP has many requirements for its maintenance and use. It needs sufficient data storage capacity, reliable networks (connectivity), and high-resolution scanners.28 WSI requires dedicated infrastructure and information technology support to minimize downtime. Specifically, because WSI allows for remote viewing, certain latency and bandwidth requirements must be met (usually more than 10 megabits per second). Additionally, depending on the institution, color calibration may be necessary for optimal image analysis.11 The use of DAIP also requires lab technicians to modify standard processes, such as cutting thinner sections, placing sections closer together for optimal scanning (n = 15, 20.8%), and dividing large specimens (n = 14, 58.1%).33 Moreover, if the manufacturer of the WSI equipment has not validated the slides, institutions must validate them themselves with a certain number of cases (CAP guidelines require a minimum of 60 slides for hematoxylin and eosin staining and at least 20 slides for each auxiliary technique).11 In addition to the extra labor costs and time, digital slide images can be slow to load and difficult to view. Because of these issues, an Asian study group had to hire technicians to help resolve the technological difficulties.28
The direct cost of DAIP, combined with its complex implementation, is substantial. A study focusing on the implementation of DAIP in labs in Europe and Asia found that the majority (63%) of the surveyed labs spent between $100,000 and $1,000,000 on implementation. However, some institutions spent between $1,000,000 and $5,000,000, and one institution (2.2%) spent more than $5,000,000.33 It is also important to note that these numbers are likely to be higher in the United States due to inflation and the higher cost of labor.
Challenges in AI implementation
There are two main challenges in the implementation and application of AI algorithms in the DP workflow: the generalizability of the model and the explainability of the model.20
Generalizability refers to how well a model performs on data it was not trained on, which depends in part on how well the model’s complexity matches the complexity of the data. Overfitting is one of the generalizability problems. For example, a model trained on many specific cases and slides may struggle with another set of specific cases. The biggest challenge for algorithms is applying what they were trained on to new datasets.20
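A generalization gap can be illustrated with a deliberately simple toy model. The sketch below is not taken from any cited study: it assumes a single hypothetical "stain intensity" feature and a one-threshold classifier, and it simulates a second site whose slides are uniformly darker. It shows a site shift rather than overfitting in the strict sense, but the symptom is the same: good internal performance and poor external performance.

```python
def fit_threshold(values, labels):
    """Learn a cutoff as the midpoint between the two class means.
    Labels: 0 = benign, 1 = malignant."""
    benign = [v for v, y in zip(values, labels) if y == 0]
    malignant = [v for v, y in zip(values, labels) if y == 1]
    benign_mean = sum(benign) / len(benign)
    malignant_mean = sum(malignant) / len(malignant)
    return (benign_mean + malignant_mean) / 2


def accuracy(values, labels, threshold):
    """Fraction of cases where 'value above threshold' matches the label."""
    predictions = [int(v > threshold) for v in values]
    matches = sum(p == y for p, y in zip(predictions, labels))
    return matches / len(labels)


# Hypothetical feature values; the same cases "scanned" at two sites.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
site_a = [0.20, 0.30, 0.25, 0.35, 0.60, 0.70, 0.65, 0.80]
site_b = [v + 0.30 for v in site_a]  # simulated darker staining at site B

threshold = fit_threshold(site_a, labels)           # trained on site A only
internal_acc = accuracy(site_a, labels, threshold)  # same-site evaluation
external_acc = accuracy(site_b, labels, threshold)  # new-site evaluation
print(internal_acc, external_acc)
```

In this toy case the cutoff separates site A perfectly but labels every site B case as malignant, which is why validation on data from other institutions matters before deployment.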
Explainability, or interpretability, involves an algorithm’s ability to allow users to understand the factors that lead to a specific decision.20 DAIP faces a common issue in AI known as the “black box” problem, where the system’s internal decision-making process cannot be traced. This issue makes it difficult for DAIP users to analyze each decision made by the AI, and therefore they may not be able to provide the best feedback to improve the AI algorithm.34 Transparency is imperative in larger institutional settings, where failure to detect bias could have enormous ethical and legal consequences.
Technological limitations of LLMs
While DAIP has the potential to revolutionize the pathology field, it still has many technological limitations. So far, DAIP programs have mostly focused on image analysis and have not taken advantage of the rise of LLMs. For example, a pathologist can use an LLM such as ChatGPT to ask general questions about a case that image analysis cannot address. However, LLMs still lack a detailed understanding of specific cases, making them less flexible than image analysis technologies.22 A recent study also showed that ChatGPT was helpful in identifying differential diagnoses in 62.2% of simulated cases, but it provided at least one erratic differential diagnosis in 3.7% of the simulated cases.35 Strikingly, among the 214 references ChatGPT provided, 12.1% were irrelevant or inaccurate, and 17.8% were non-existent.35 Therefore, careful review and thorough validation are required before fully adopting LLMs in diagnostic pathology.
Mitigation approach and future directions
While using DAIP undoubtedly signifies advancements in technology, it is essential to critically assess whether it truly meets the unique needs and goals of individual laboratories or institutions. Given the costs and legal/ethical concerns associated with implementing DAIP, manufacturers of DAIP software and equipment should work to minimize implementation costs. Additionally, we must recognize the need to optimize DAIP implementation, including reducing storage costs, improving access to experts and expert opinions, encouraging institutional administration support for installation, and replacing microscopes. In 2020, the primary adopters of DAIP were still large institutions, while smaller and more rural laboratories could also benefit significantly from DAIP implementation.33
AI holds great promise, especially in anatomic pathology. However, storage remains a significant concern.29,33 For example, one hematoxylin and eosin slide can contain up to 10 gigapixels of information at 40x magnification. This volume also makes it challenging for pathologists to notice and account for all the information, not just on one slide but across many. AI algorithms are able to produce consistent results when provided with the same slides and can identify anomalies even within large volumes of data.
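To give a sense of scale for the storage concern, the pixel count above can be converted into approximate storage volume. This is a back-of-the-envelope sketch: the 3 bytes per pixel (24-bit RGB), the 20:1 compression ratio, and the 1,000 slides per month are all assumptions made for illustration, and the function names are hypothetical. Only the 10-gigapixel upper bound comes from the text.

```python
def uncompressed_bytes(pixels, bytes_per_pixel=3):
    """Raw image size, assuming 24-bit RGB (3 bytes per pixel)."""
    return pixels * bytes_per_pixel


def monthly_storage_tb(slides_per_month, pixels_per_slide,
                       compression_ratio, bytes_per_pixel=3):
    """Approximate new storage needed per month, in terabytes (10**12 bytes)."""
    raw = slides_per_month * uncompressed_bytes(pixels_per_slide, bytes_per_pixel)
    return raw / compression_ratio / 1e12


pixels = 10_000_000_000  # upper bound quoted in the text (10 gigapixels)

raw_gb = uncompressed_bytes(pixels) / 1e9
monthly_tb = monthly_storage_tb(
    slides_per_month=1_000,  # hypothetical workload
    pixels_per_slide=pixels,
    compression_ratio=20,    # hypothetical lossy compression ratio
)
print(raw_gb, monthly_tb)
```

Because 10 gigapixels is an upper bound, real averages are likely lower. The calculation is mainly useful for comparing scenarios such as retention periods or compression settings.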
Despite AI’s promising effects on the pathology workflow, a “translation gap” remains, in which challenges arise in implementing these algorithms in practice. For instance, variations in tissue acquisition and slide preparation processes may influence the performance of image analyses.16 Another challenge is selecting the most reliable performance metric, as accuracy, balanced accuracy, area under the receiver operating characteristic curve, sensitivity, and specificity are all important. The College of American Pathologists has acknowledged this issue and has led the development of recommendations for ML performance,36 although further validation and research are still needed.
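The difficulty of choosing a single metric can be seen in a small worked example. The sketch below computes four of the metrics named above from confusion-matrix counts. The function name and the counts are hypothetical, and the scenario (a model that calls every slide benign in a set where 10 of 100 slides are malignant) is constructed only to show how the metrics can disagree.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute common metrics from confusion-matrix counts.

    tp/fn: malignant cases called malignant/benign.
    tn/fp: benign cases called benign/malignant.
    Assumes at least one malignant and one benign case.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical test set: 10 malignant and 90 benign slides,
# and a model that labels every slide benign.
result = classification_metrics(tp=0, fn=10, tn=90, fp=0)
print(result)
```

Here accuracy looks high only because benign slides dominate, while sensitivity shows that every malignant case was missed. Balanced accuracy sits at chance level, which is one reason several metrics are usually reported together.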
Several approaches can mitigate ML biases or fairness issues, and future work on this subject is warranted. First, radiology colleagues have addressed three key components for reducing biases in ML: rigorous dataset curation and data handling, development of robust ML models, and a better understanding and utilization of various performance metrics.37–39 Their experiences may be helpful in mitigating biases and fairness issues in DAIP. Second, regrouping study subjects by the variable of interest could help reduce ML biases and improve AI fairness when classifying cancer outcomes.40 Third, newer ML algorithms, such as transfer learning, fairness-aware classifiers, and unbiased prompts for LLMs,30,41,42 may help reduce ML biases in DAIP implementation. Finally, pathologists and software engineers can only mitigate ML biases if they are aware of them, highlighting the need for related education. Caution should also be exercised when DAIP biases are more prevalent or subtle.
AI and the pathology profession will grow together, influencing each other as AI continues to advance and become increasingly embedded in pathology and other medical practices.9 AI has already been found useful in pathology and medical education and can help identify reporting errors and discrepancies.43,44 Clearly, pathologists need more education on DAIP technology and its applications.28 Residency programs may formally incorporate DAIP into their curriculum as a required or optional rotation in the future, focusing on general and practical subjects. Pathologists who specialize in DAIP will be needed in the near future to bridge AI/data science and pathology practice. These specialists may be trained through designated fellowship programs alongside pathology informaticians but will differ from them in having deeper knowledge of AI/data science. However, it remains unclear whether their fellowship should be separate from or integrated with current pathology informatics programs.
We anticipate that pathologists will continue to play an important and integrated role in patient care. The non-diagnostic roles of pathologists are unlikely to be replaced by DAIP, as described earlier. However, DAIP will reshape the pathology profession in several ways. First, pathology reports may become more patient-centered and more understandable to laypersons.26 Pathologists may then have direct interactions with patients, aided by LLMs (e.g., copilot).45 Second, DAIP’s assistance will enable the efficient handling of many simpler cases,9,32,46 while pathologists will be able to devote more time to complex cases or direct patient care. This may challenge the value of general surgical pathologists but could be justified in small institutions. Third, seeking second opinions will become much easier, faster, and cheaper on the DAIP platform compared to using glass slides. Many pathologists could work remotely as consultants. Finally, pathologists will need to work hard to “outsmart” DAIP and AI and demonstrate their undeniable clinical value, especially since DAIP may appear more accurate than pathologists in diagnosing certain diseases.5,14,15 One example is melanoma diagnosis. Although dermoscopy and digital dermoscopy were first proposed in 2009,47 recent studies have shown the possibility of dermatologist-like AI tools for diagnosing melanoma.48
LLMs
A future direction that institutions should consider is incorporating LLMs alongside image analysis in pathology. Paired with image analysis, LLMs could prove to be extremely helpful to pathologists, but they also come with limitations. LLMs are not yet fine-tuned enough for pathological purposes, often providing broad and generic responses. They also present unique challenges: ChatGPT-4, for example, has been known to hallucinate by generating fabricated sources when asked about histopathology, a problem not seen in digital pathology-specific AI. However, there is progress with more domain-specific LLMs, such as derivatives of BERT (Bidirectional Encoder Representations from Transformers, from Google) and GPT (Generative Pre-trained Transformer, from OpenAI), demonstrating significant advancements in a very new field.49 While LLMs are probably not ready for wide clinical laboratory application yet, they have great potential for streamlining and organizing fields such as public health and pathology.
For this direction with LLMs to be fruitful, four main improvements are essential: collaboration between AI developers and institutions, minimization of bias through training on how to recognize biases, more longitudinal studies on AI use in laboratories across the world, and clear regulations governing AI at every scale (local, regional, and global).22 Without further understanding, research, and data assessing DAIP and LLMs, the research and clinical fields may not be able to fully take advantage of these risky yet potentially valuable tools.