Survival and comorbidities in lung cancer patients: Evidence from administrative claims data in Germany

Lung cancer is the most common cancer type worldwide and, in Germany, has the highest mortality rate among cancers in men and the second highest in women. Yet the role of comorbid illnesses in lung cancer patient prognosis is still debated. We analyzed administrative claims data from one of the largest statutory health insurance (SHI) funds in Germany, covering close to 9 million people (11% of the national population); the observation period ran from 2005 to 2019. Lung cancer patients and their concomitant diseases were identified by ICD-10-GM codes. Comorbidities were classified according to the Charlson Comorbidity Index (CCI). Incidence, comorbidity prevalence and survival were estimated by sex, age at diagnosis, and place of residence. Kaplan-Meier curves with 95% confidence intervals were built in relation to common comorbidities. We identified 70,698 incident lung cancer cases in the sample. Incidence and survival figures were comparable to official statistics in Germany. The most prevalent comorbidities were chronic obstructive pulmonary disease (COPD) (36.7%), followed by peripheral vascular disease (PVD) (18.7%), diabetes without chronic complications (17.4%), congestive heart failure (CHF) (16.5%) and renal disease (14.7%). Relative to overall survival, CHF, cerebrovascular disease (CEVD) and renal disease were associated with the largest drops in survival probabilities (9% or higher), while PVD and diabetes without chronic complications were associated with moderate drops (7% or lower). The study showed a negative association between survival and the most common comorbidities among lung cancer patients, based on a large sample for Germany. Further research needs to explore the individual effect of comorbidities disentangled from that of other patient characteristics such as cancer stage and histology.


Introduction
Lung cancer is the most common cancer type worldwide, with close to 2.1 million incident cases in 2019 [1]. In Germany, 59,221 individuals were newly diagnosed with lung cancer and 44,881 died of it in 2019 [2]. It is the leading cancer type in terms of mortality for men and the second leading for women, causing 22% of cancer-related deaths in men and 16% in women [2]. Incidence is expected to increase in the coming years, mainly driven by diagnoses in women [3].
There is a consensus that the comorbidity burden among lung cancer patients is high [4][5][6][7]. This is because smoking is strongly related to lung cancer, cardiovascular diseases and other respiratory tract diseases. Cardiac toxicity following radiotherapy, chemotherapy or immunotherapy may also contribute to the risk of concomitant diseases after treatment [8]. The comparatively worse prognosis of lung cancer patients with comorbidities is mainly a result of lower odds of undergoing surgical resection and of receiving or completing chemotherapy, as well as of therapy dose reductions and a higher likelihood of an extended hospital stay and of postoperative complications [9][10][11][12][13][14][15][16][17]. The presence of concomitant diseases could also interfere with a complete diagnostic evaluation and, consequently, with accurate staging. Comorbidities might also be related to shorter survival because of their association with older age [16].
Nevertheless, researchers have not yet fully agreed on the impact comorbidities might have on lung cancer patients' chances of survival. Some studies have indicated a non-significant to small effect [6,18,19,20,21]. These studies observed that lung cancer patients with comorbidities had a similar prognosis to those without comorbidities. It has been suggested that shorter survival times are due to the absence of guideline-based treatment and not necessarily to the comorbidity per se [20,22]. In addition, in the presence of comorbidities, individuals might need to attend regular checkups or receive treatment at relevant clinical departments, leading to earlier detection of the lung cancer and thereby to early treatment [11,22]. Given this conflicting evidence in the literature, there is an important need to further explore the influence of comorbid illnesses on patient survival in order to prevent suboptimal disease management and to better tailor treatment.
In this context, the objective of this study is threefold: first, to estimate lung cancer incidence based on administrative claims data and to validate it against the statistics from the national cancer registry in Germany; second, to describe the prevalence of comorbidities among lung cancer patients in relation to individual characteristics and their evolution over time; and third, to analyze the association between comorbidities and survival rates for lung cancer patients.

Data source
We analyzed administrative claims data from Barmer health insurance, one of the largest statutory health insurance (SHI) funds in Germany, covering close to 9 million people (11% of the national population) [23]. Access was granted by the Scientific Data Warehouse of Barmer (W-DWH) with pseudonymized identities of the insured. The study was performed in accordance with the Declaration of Helsinki and follows established principles of good practice in secondary data analysis [24]. The dataset encompassed individual-level information from inpatient and outpatient (in both hospitals and ambulatory services) records for the 15-year period from 2005 to 2019. It includes information on individual characteristics, such as sex, year of birth, place of residence, and affiliation and disenrollment dates with their respective reasons. Outpatient services were registered as hospital outpatient if offered directly in a hospital and as ambulatory if offered elsewhere. The reporting system employed the German Modification of the 10th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD-10-GM) to list patient diagnoses. The documentation of the data source was structured in year-quarters; therefore, the analysis was conducted on a quarterly basis.

Patient identification
We identified incident cases on a year-by-year basis. In accordance with McGuire et al. [25] and Schwarzkopf et al. [26], patient identification started from all individuals with a lung cancer diagnosis, classified with the ICD-10-GM code C33 or C34, in a particular year. To filter out cases with an uncertain diagnosis, we included only individuals with either one inpatient primary diagnosis or two confirmed hospital outpatient or ambulatory diagnoses in consecutive quarters. Additionally, to capture only incident cases, we excluded patients with a prior lung cancer diagnosis or treatment in the previous two years. We also omitted cases involving individuals with at least one inpatient diagnosis or one confirmed hospital or ambulatory outpatient diagnosis in the previous two years. Only individuals with Barmer insurance coverage for at least 24 months before and 6 months after the date of diagnosis (unless death occurred earlier) were included, to ensure sufficient records within the insurance fund. The diagnosis date for an individual was set as the earlier of the first inpatient admission and the first outpatient visit to a hospital or ambulatory service. Individuals with inconsistent gender information were also excluded. In case of multiple places of residence, the location with the longest time in residence was chosen as the actual place of residence. Individuals younger than 18 years of age were excluded as well, in order to reduce the effect of outliers. Lastly, those insured by the DBKK, a health insurance fund integrated into Barmer in 2017, were excluded, as their records were incomplete for the time frame. The final panel covers the period from 2007 to 2018.
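The two core identification rules can be sketched in code. The following Python fragment is only an illustration under stated assumptions, not the procedure actually run on the Barmer data: all function and parameter names are hypothetical, quarters are encoded as running integers, "consecutive quarters" is read as two adjacent quarters, and the two-year look-back is read as the eight quarters preceding the diagnosis quarter.

```python
from typing import Iterable


def quarter_index(year: int, quarter: int) -> int:
    """Map (year, quarter) to a running integer; adjacent quarters differ by 1."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    return year * 4 + (quarter - 1)


def meets_case_definition(inpatient_primary: Iterable[int],
                          outpatient_confirmed: Iterable[int]) -> bool:
    """Case definition for C33/C34 diagnoses within the index year.

    True if there is at least one inpatient primary diagnosis, or if
    confirmed outpatient diagnoses occur in two adjacent quarters
    (assumed reading of "consecutive quarters").
    """
    if len(list(inpatient_primary)) >= 1:
        return True
    quarters = set(outpatient_confirmed)
    return any(q + 1 in quarters for q in quarters)


def passes_washout(diagnosis_quarter: int,
                   earlier_quarters: Iterable[int],
                   washout: int = 8) -> bool:
    """True if no lung cancer record falls in the `washout` quarters
    immediately before the diagnosis quarter (8 quarters = two years)."""
    return not any(diagnosis_quarter - washout <= q < diagnosis_quarter
                   for q in earlier_quarters)
```

For instance, confirmed ambulatory diagnoses in the fourth quarter of 2010 and the first quarter of 2011 satisfy the definition, whereas diagnoses in the first and third quarters of 2010 do not.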

Comorbidity identification
Following guidance in Edwards et al. [27] and Morishima et al. [28], comorbidities in lung cancer patients were identified by retrieving primary and secondary inpatient diagnoses one year before and three months after the lung cancer diagnosis date. Furthermore, we employed the categories listed in the Charlson Comorbidity Index (CCI), a validated and widely used index, to classify the comorbidities into 17 groups [11,12,17,20]. We used an algorithm developed in Quan et al. [29] and Quan et al. [30] for the grouping with their respective ICD-10-GM codes. In the classification process, we excluded two comorbidity groups, namely malignancy and metastatic carcinoma, as they could be the result of lung cancer itself. Single comorbidity diagnoses as well as the number of comorbidities were employed to assess the impact of comorbid illnesses on lung cancer survival.
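As an illustration of the look-back window and the exclusion of the two malignancy groups, a minimal Python sketch follows. It rests on several assumptions: quarters are encoded as running integers, the window is read as four quarters before to one quarter after the diagnosis quarter, and the prefix-to-group table is a deliberately tiny toy subset for demonstration, not the full code lists of Quan et al. [29,30]. All names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Groups dropped because they could result from the lung cancer itself.
EXCLUDED_GROUPS = {"malignancy", "metastatic carcinoma"}

# Toy mapping from three-character ICD-10 prefixes to CCI groups.
# Illustrative subset only; the real algorithm covers 17 groups.
TOY_PREFIX_TO_GROUP: Dict[str, str] = {
    "I50": "CHF",
    "J44": "COPD",
    "N18": "renal disease",
    "C78": "metastatic carcinoma",
}


def comorbidity_groups(inpatient_diagnoses: Iterable[Tuple[int, str]],
                       diagnosis_quarter: int,
                       prefix_to_group: Dict[str, str] = TOY_PREFIX_TO_GROUP,
                       quarters_before: int = 4,
                       quarters_after: int = 1) -> Set[str]:
    """Return the CCI groups recorded inside the window around diagnosis.

    `inpatient_diagnoses` holds (quarter, ICD code) pairs taken from
    primary and secondary inpatient diagnoses.
    """
    first = diagnosis_quarter - quarters_before
    last = diagnosis_quarter + quarters_after
    groups: Set[str] = set()
    for quarter, code in inpatient_diagnoses:
        if not first <= quarter <= last:
            continue  # outside the look-back window
        group = prefix_to_group.get(code[:3])
        if group is not None and group not in EXCLUDED_GROUPS:
            groups.add(group)
    return groups
```

The number of comorbidities for a patient is then simply the size of the returned set.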

Statistical analysis
Lung cancer incidence was estimated by relating the identified incident cases to the entire Barmer population. Comorbidity prevalence was calculated for each comorbidity group as the percentage of lung cancer patients diagnosed with that particular comorbidity. Survival rates were obtained for three and six months after the diagnosis date, as well as for one, three and five years after. Incidence, comorbidity prevalence and survival rates were computed for each year and also by sex, age at diagnosis and place of residence. Time trends in these metrics were identified by comparing three-year averages across the period of analysis. If the estimated averages increased consistently (i.e., without exception) throughout the sample, the trend was labeled as increasing; likewise for decreasing averages. Kaplan-Meier curves were estimated by censoring the sample at December 31, 2019 and excluding patients who terminated their insurance affiliation for reasons other than death. Confidence intervals (CI) at the 95% level were built around the curves in order to compare survival probabilities across subpopulations. The analyses were performed using R software, version 3.6.0.
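The trend rule lends itself to a short illustration. The Python sketch below assumes that the three-year averages are taken over non-overlapping blocks (four blocks for a 12-year panel); the text does not say whether the windows overlap, so this reading, like the function names, is an assumption.

```python
from typing import List, Sequence


def three_year_averages(yearly_values: Sequence[float]) -> List[float]:
    """Average consecutive, non-overlapping blocks of three years."""
    if len(yearly_values) < 6 or len(yearly_values) % 3 != 0:
        raise ValueError("need at least two complete three-year blocks")
    return [sum(yearly_values[i:i + 3]) / 3
            for i in range(0, len(yearly_values), 3)]


def label_trend(yearly_values: Sequence[float]) -> str:
    """Label a series as increasing or decreasing only if every
    three-year average moves in the same direction, without exception."""
    averages = three_year_averages(yearly_values)
    pairs = list(zip(averages, averages[1:]))
    if all(later > earlier for earlier, later in pairs):
        return "increasing"
    if all(later < earlier for earlier, later in pairs):
        return "decreasing"
    return "no trend"
```

Under this rule a single block that moves against the others is enough to leave a series unlabeled, which makes the labels conservative.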

Lung cancer incidence
The number of new lung cancer diagnoses identified during the period of analysis was 70,698. Table 1 presents this figure by year of diagnosis, as well as percentages relative to new diagnoses within each year, stratified by sex, age at diagnosis, and place of residence. Incident cases vary between 5,108 and 6,504 per year. It is important to highlight that, relative to the total population of Germany, persons insured through Barmer are more frequently women, belong to older age groups, reside more often in the state of North Rhine-Westphalia and reside less often in the states of Baden-Württemberg and Bavaria. Incidence numbers are displayed in Table 2, while their age-standardized counterparts appear in Appendix 1. These incidence figures were calculated with a population base without any age restriction. Appendix 1 suggests that age-standardized incidence for our sample is similar to that reported by the German Centre for Cancer Registry Data for the whole of Germany [2]. Figures for women are nearly identical; however, those for men are on average eight points below national statistics, although the gap seems to be closing in recent years. Table 2 reveals an overall increasing trend in incidence throughout the entire time period, starting at 60.3 cases per 100,000 individuals in 2007 and ending at 78.4 cases per 100,000 individuals in 2018. This trend was observed in both men and women, although incidence in men is close to 50% higher than incidence in women. Incidence is highest in the age group 65-80, which is also the only age group that exhibited an increasing trend, growing from 171.3 cases per 100,000 individuals to 215.9 cases per 100,000 individuals. In contrast, incidence steadily decreased in the age group 35-50 over the whole period. The states of Bremen, Hamburg, Saarland, Saxony and Thuringia presented stable incidence figures. All other states showed an increasing trend.
Incidence was particularly high in Schleswig-Holstein with around 100 cases per 100,000 above the sample average, and particularly low in Mecklenburg-Vorpommern with almost 20 cases per 100,000 below the sample average.
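For readers less familiar with the rates reported here, the Python sketch below shows how a crude incidence per 100,000 and a directly age-standardized incidence can be computed. Direct standardization and the choice of standard-population weights are assumptions made for illustration; the text does not state which standard population underlies Appendix 1, and all names and numbers are hypothetical.

```python
from typing import Sequence


def crude_rate(cases: float, population: float, per: float = 100_000) -> float:
    """Incident cases per `per` insured individuals."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


def age_standardized_rate(cases_by_age: Sequence[float],
                          population_by_age: Sequence[float],
                          standard_weights: Sequence[float],
                          per: float = 100_000) -> float:
    """Direct standardization: weight each age-specific rate by the
    share of that age group in a chosen standard population."""
    if not (len(cases_by_age) == len(population_by_age) == len(standard_weights)):
        raise ValueError("all inputs need one entry per age group")
    total_weight = sum(standard_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(w * crude_rate(c, p, per)
                   for c, p, w in zip(cases_by_age, population_by_age, standard_weights))
    return weighted / total_weight
```

Because the insured population is older than the German population as a whole, the standardized figure rather than the crude one is the appropriate basis for comparison with national statistics.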

Comorbidity prevalence
Comorbidity prevalence statistics can be found in Table 3, which displays the percentage of lung cancer patients with a particular comorbidity diagnosis, as well as with a given number of comorbidities. As shown, close to 30% of lung cancer patients were not affected by any comorbidity, while 50% were affected by one or two comorbidities, 15% by three or four, and less than 5% by five or more. It is also noteworthy that the percentage of lung cancer patients without comorbidities followed a downward trajectory over the period of analysis, while the percentage of those with three or more followed an upward trend. Chronic obstructive pulmonary disease (COPD) was by far the most common comorbidity, diagnosed in more than a third of lung cancer patients. It was followed by peripheral vascular disease (PVD), diabetes without chronic complications, congestive heart failure (CHF) and renal disease, each affecting between 10% and 20% of lung cancer patients on average and presenting an increasing trend over time. Peptic ulcer was the only comorbidity with a decreasing trend within the timeframe analyzed. Table 4 presents comorbidity prevalence numbers by sex and age group. It suggests that men and older individuals have relatively more comorbidities than other lung cancer patients. The five most prevalent comorbidities, previously mentioned, were the same in both men and women; however, these affected men on average 5% more often than women. Moreover, COPD was prevalent in more than 10% of the lung cancer patients in the youngest age group. This 10% threshold was also reached by PVD and diabetes without chronic complications in the 35-50 age group, as well as by CHF, renal disease and cerebrovascular disease (CEVD) in the 65-80 age group, and finally by dementia in the 80+ age group.

Lung cancer survival
The survival rate is presented in Table 5.
Note: If the three-year average is always higher (lower) than the three-year average of the period immediately before, the trend is labeled as increasing (decreasing). An increasing trend in incidence is denoted by (+), a decreasing trend by (−).
[...] (one-year), 67.4% (three-month) and 80.8% (one-month). In addition, survival rates improved across time for the longer time windows (five-year, three-year and one-year) while they remained stable for the shorter time windows (six-month and three-month). Survival was consistently higher for women than men, with differences under 10 percentage points. For men, an upward trend was observed in the five-year and three-year survival rates. In women, an upward trend was observed in the three-year survival rate and a downward trend in the five-year survival rate. Most of these trends were also identified by the German Centre for Cancer Registry Data for the whole of Germany, as seen in Appendix 2 [2]. Nevertheless, our survival numbers are on average 5% higher, with differences being less pronounced in men and for the five-year mark. Figures provided by the Global Surveillance of Cancer Survival Programme (CONCORD-3) are more comparable. As shown in Appendix 3, they estimated a five-year age-standardized survival rate of 18.3% in 2010 for Germany, which was 3% lower than our rate for the same year [31]. Furthermore, the survival rate was lower in the older age groups (Table 5), regardless of the time window analyzed. Differences were, however, more pronounced for longer time frames. For example, the difference between the youngest and the oldest age group is nearly 60 percentage points for the five-year survival, and almost 30 percentage points for the three-month survival. Improvements in survival rates were observed more often, but not exclusively, in the 50-65 and 65-80 age groups during the sample period. Declines were seen only for the 18-35 and 80+ age groups in the five-year survival.

Lung cancer survival and comorbidity
Fig. 1 depicts the Kaplan-Meier curves for all lung cancer patients identified in the sample, as well as for those diagnosed with the most common comorbidities, that is, those affecting at least 10% of lung cancer patients. As observed, except for COPD, a diagnosis of any of the major comorbidities is associated with a significantly lower probability of survival. Relative to overall survival, the largest drops were observed for CHF (13%, 10%, 9% and 8% less in one-year, three-year, five-year and ten-year survival probabilities, respectively), CEVD (12%, 9%, 8% and 8% less) and renal disease (11%, 9%, 8% and 8% less). Smaller drops were found for PVD (7%, 7%, 7% and 6% less) and diabetes without chronic complications (6%, 6%, 5% and 5% less). The probability of survival associated with COPD was significantly lower than the overall survival only after six years from the diagnosis, and this difference was never larger than 2%. On the other hand, patients without any reported comorbidity had higher survival probabilities (8%, 8%, 7% and 7% more).
Note: Comorbidities are classified with the categories listed in the Charlson Comorbidity Index (CCI). COPD stands for chronic obstructive pulmonary disease, PVD for peripheral vascular disease, CHF for congestive heart failure and CEVD for cerebrovascular disease.
Moreover, Kaplan-Meier curves by the number of comorbidities are shown in Fig. 2. The curves are nearly identical for the overall sample and for the group with one or two comorbidities. Even though the mean survival probability was slightly lower for the latter, it was never significantly different from that of the overall sample at any point after the diagnosis. In contrast, the toll for the remaining groups was large: the probability of survival was 10%, 8%, 8% and 7% lower for the one-year, three-year, five-year and ten-year time windows respectively for the group with three or four comorbidities, and 15%, 11%, 11% and 10% lower for the group with five comorbidities or more.
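The survival probabilities compared above are Kaplan-Meier (product-limit) estimates. As a reminder of how such a curve and its confidence band are obtained, a minimal Python sketch follows. It uses plain Greenwood intervals clipped to [0, 1], which is an assumption: the study was run in R and the interval type used there is not stated. Function and variable names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(durations: Sequence[float],
                 events: Sequence[int],
                 z: float = 1.96) -> List[Tuple[float, float, float, float]]:
    """Product-limit estimate with Greenwood confidence intervals.

    durations: follow-up time per patient
    events:    1 if the patient died at that time, 0 if censored
    Returns (time, survival, ci_low, ci_high) at each distinct death time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have equal length")
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    greenwood_sum = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            if at_risk > deaths:
                greenwood_sum += deaths / (at_risk * (at_risk - deaths))
            se = survival * math.sqrt(greenwood_sum)
            curve.append((t,
                          survival,
                          max(0.0, survival - z * se),
                          min(1.0, survival + z * se)))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve
```

Two subgroup curves can then be read as significantly different at a given time when their intervals do not overlap, which is how the comparisons in this section are phrased.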

Discussion
The present study identified 70,698 incident lung cancer cases covering all of Germany across a 12-year period between 2007 and 2018, making it, to our knowledge, the largest lung cancer patient sample in Germany. We validated the representativeness of the sample by comparing our age-standardized incidence figures with those from the German Centre for Cancer Registry Data, which is responsible for pooling and assessing every reported cancer case it receives from the population-based cancer registries in each German federal state. Our incidence for women was nearly identical to national statistics over the 12 years analyzed; our incidence for men, however, was lower [2]. This might be partially explained by the socioeconomic background of patients insured with Barmer [32]. While Barmer has traditionally been a health insurance fund for white-collar workers, lung cancer appears to be more common among low-income individuals in Germany, due to higher smoking rates [33,34]. Moreover, the overall increasing trend in age-standardized lung cancer incidence in our sample for the 2007-2018 period is consistent with numbers for the whole of Germany from the German Centre for Cancer Registry Data [2].
Comorbidity prevalence in our sample was in line with that found in Islam et al. [12]. However, the analyses by Iachina et al. [11] and Nilsson et al. [20] identified a smaller proportion of lung cancer patients with comorbidities, at 53% and 44% respectively. This outcome might be explained by both studies focusing on non-small cell lung cancer (NSCLC) patients, for whom a smoking history is usually less common [35]. In addition, these studies retrieved their comorbidity profiles from patient registries, which tend to report diagnostic profiles in less detail compared to administrative claims or medical records, for incentive reasons [36,37]. This is also the case in Seigneurin et al. [21], in which 56% of lung cancer patients bear at least one comorbidity. On the other hand, Tammemagi et al. [16] and Grose et al. [5] reported larger proportions of lung cancer patients with comorbidities, 88% and 87%, respectively. However, they included a longer comorbidity list compared to that of the CCI. Comorbidity prevalence found in Wang et al. [17] and Feng et al. [38] was higher as well, at 81% and 78% respectively, most likely due to the samples being restricted to older individuals, who tend to bear more comorbidities.
The most prevalent concomitant diseases in our study were similar to those in the Annual Report to the Nation on the Status of Cancer, based on a sample of more than 160,000 lung cancer patients in the US [27]. The only exception was the prevalence of renal disease, for which we estimated a considerably larger percentage, presumably because that report identified only cases of chronic renal disease. Likewise, at least three of the five most prevalent comorbidities in Islam et al. [12], Grose et al. [5], Murawski et al. [6], and Tammemagi et al. [16] coincided with the five most frequent comorbidities in our sample. Nevertheless, the proportion of lung cancer patients with COPD is lower in Islam et al. [12] and Murawski et al. [6] compared to our results, most likely due to sample differences. Relative to our sample, patients are on average older in Islam et al. [12] and predominantly male in Murawski et al. [6], for which COPD tends to be more prevalent [39]. Other comorbidities that were highly prevalent in previous research were either illness symptoms or severity indicators that are not covered by the CCI.
With respect to survival, the short- and long-term time trends, as well as the divergence across sex, found in this study were also identified by the German Centre for Cancer Registry Data for the whole of Germany [40]. Nevertheless, our survival probabilities were on average higher, with less discrepancy for men and the overall five-year mark. These differences could reflect disparities between the age structure of our sample and that of the German population. In this regard, comparable age-standardized survival rates are provided by the Global Surveillance of Cancer Survival Programme (CONCORD-3), although our estimate is still slightly higher. A greater probability of survival among lung cancer patients without comorbidities was also observed in Islam et al. [12], Tammemagi et al. [16], Kravchenko et al. [13], Welch et al. [22] and García-Pardo et al. [10]. However, our estimated survival rates cannot be directly compared, given that these studies controlled for other patient characteristics. In contrast, Seigneurin et al. [21] and Murawski et al. [6] found that lung cancer patients were not necessarily better off in terms of survival when comorbidity free, although significant differences did exist by histology in Seigneurin et al. [21] and by treatment regime in Murawski et al. [6]. Nonetheless, Murawski et al. [6] did not control for cancer stage and histology, which may explain this non-negative effect. Seigneurin et al. [21] did control for these and nevertheless obtained a non-significant effect of comorbidities on survival among NSCLC patients. Furthermore, the relatively small or even non-significant association between survival and COPD, the most frequent comorbidity, was supported by Islam et al. [12] and Tammemagi et al. [16].
Patients with COPD were more often diagnosed at earlier lung cancer stages, most likely as a consequence of close monitoring by a physician or treatment at relevant clinical departments, which offers an opportunity for earlier detection [11]. With the exception of PVD, Islam et al. [12] and Tammemagi et al. [16] also found a negative association between survival and the five most important remaining comorbidities (PVD, diabetes without complications, CHF, renal disease and CEVD). The lower survival in lung cancer patients with CHF might be caused by lower likelihoods of undergoing surgery and chemotherapy, in particular in earlier stages, where the presence or absence of CHF is crucial in the treatment choice [9,16]. A similar conclusion was provided by Welch et al. [22]. In contrast, Kravchenko et al. [13] associated lower lung cancer survival with increasing operative mortality among patients with CHF. As for lung cancer patients with renal disease, poorer survival probabilities could result from modified doses or complete discontinuation of chemotherapy at later stages, which is often platinum based and therefore not recommended for those with kidney disease [12]. For PVD, the effect was not significant in the results of Islam et al. [12] and even positive in Tammemagi et al. [16]. This could suggest that the probability of survival in lung cancer patients with PVD was essentially explained by other treatment and individual effects rather than the comorbidity itself.
Overall, this study provides a detailed picture of the comorbidity profile of lung cancer patients in Germany with a representative sample covering all sixteen federal states and a period of more than a decade. It is necessary to supply figures on this matter, as the German Centre for Cancer Registry Data does not collect information on comorbidities, a crucial element in understanding lung cancer survival. In this respect, we have delivered evidence on the association of survival with the most common comorbidities, and thereby a clearer characterization of lung cancer prognosis. This could facilitate improvement in treatment selection and strategies that enhance survival and quality of life in lung cancer patients.

Limitations
The main limitation of our analysis was the inability to control for individual characteristics that might influence lung cancer patient survival. This is because administrative data do not include information on, for instance, tumor stage and histology. We cannot, therefore, disentangle the effect of comorbidities on survival rates by means of multivariate regression methods. Tumor stage and histology have been shown to explain most of the variation in lung cancer survival, and by excluding them, an econometric analysis would inevitably suffer from omitted-variable bias [4,16,21,41]. For this reason, the reader should interpret our results as associations rather than causal effects.

Conclusions
The present study describes the epidemiological picture of lung cancer incidence and comorbidity prevalence, and provides evidence on the relationship between survival and comorbidities among lung cancer patients, based on administrative claims data from a health insurance company in Germany. The sample was large and the patient identification strategy delivered incidence and survival figures that were comparable to official statistics for lung cancer patients in Germany. Comorbidities are prevalent in around 70% of lung cancer patients and are usually associated with a lower probability of survival. Among common comorbidities, the largest impact on survival is observed in lung cancer patients with CHF, CEVD or renal disease, and a more moderate impact in those with PVD or diabetes without chronic complications. Further research for Germany is needed to disentangle the effect of comorbidities on survival from other patient characteristics such as cancer stage and histology.

Acknowledgement:
We are grateful for the collaboration with the Statutory Health Insurance Company Barmer, especially to Ursula Marschall and Helmut L'hoest for the conception of the article, to Beata Hennig for data analysis support and to Martial Mboulla and Joachim Saam for technical assistance. We also thank Rachel Eckford for proofreading the manuscript.
Funding Statement: This research received no specific grant from any funding agency in the public, commercial, or nonprofit sectors.
Author Contributions: The authors confirm contribution to the paper as follows: study conception and design: all authors; data collection: Diego Hernandez; analysis and interpretation of results: Diego Hernandez, Karla Hernandez-Villafuerte, Chih-Yuan Cheng; draft manuscript preparation: Diego Hernandez, Karla Hernandez-Villafuerte. All authors reviewed the results and approved the final version of the manuscript.
Availability of Data and Materials: This study is supported by Barmer, with whom the Division of Health Economics has a Data Use and Transfer Agreement. Personal data of the beneficiaries were pseudonymized by Barmer before data sharing. Personal identifiers were masked or deleted prior to receiving the data. Quasi-identifiers were generalized (year of birth only, deletion of the last two digits of the zip code, etc.). The processing and analysis of sensitive data took place by remotely accessing the servers at the Scientific Data Warehouse of Barmer under special data protection conditions. The Data Use and Transfer Agreement does not provide for data distribution to third parties, and the data are therefore not available.
Ethics Approval: Not applicable.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.