
Best Evidence Synthesis of the Clinical Performance of Ceramic Bearings in Total Hip Arthroplasty with Focus on BIOLOX®delta

by Dirk Stengel MD, PhD, MSc

BG Kliniken - Klinikverbund der gesetzlichen Unfallversicherung gGmbH
Berlin, Germany


A key principle and driver of the global economy is that any company aims to produce a product far better or much cheaper (while of similar quality and function) than that of its competitors to strengthen its role as a market leader. Both the pharmaceutical and medical device industries have a unique societal role and responsibility in this global competition, as their products may immediately affect the fate of an individual patient, as well as health-related outcomes and function of a population with a certain disorder or injury.

Estimating the utility and value of a health-care intervention demands a thorough trade-off between its reported benefits and harms. Unbiased, transparent, and easily understandable information will decide whether payers, professionals and patients opt for one treatment over another, even if it is associated with higher tangible (i.e., monetary) and intangible costs (e.g., a higher risk of adverse events).

Tribological pairing is only one of a vast number of variables affecting outcomes after total hip arthroplasty (THA). Demographic baseline profiles, comorbidity, surgical expertise and (minimally-invasive) access routes, navigation, hardware from various manufacturers, cemented or cementless fixation, peri-operative and rehabilitation protocols, and many other factors may have a far greater impact on recovery, function, and long-term revision-free component survival than the individual material constituting the acetabular liner and femoral head.

Keep in mind that most novel treatments, especially in orthopedics, represent step innovations with marginal effect sizes. Demonstrating a difference from the standard of care or other therapeutic options while controlling for multiple confounders requires thousands of subjects and datasets.


What is the best available evidence?

Health-care authorities such as the German Institute for Quality and Efficiency in Health Care (Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen, IQWiG) always pose the following key questions, all of which must be answered in detail to increase the likelihood of coverage by insurers and acceptance by providers:


  1. How valid is the current best available evidence on the safety and effectiveness of a therapeutic intervention (here: ceramic bearings) compared to the standard of care to draw meaningful conclusions, ideally to draw causal inferences?
  2. Is there scientific information from data sources with a low risk of bias, specifically well-designed (!) randomized controlled trials (RCTs) with a sample size large enough to make robust predictions?
  3. Does the reported effectiveness or benefit in patient-centered outcomes (e.g., function, health-related quality of life) outperform any intervention-specific risk (here: audible noise and/or squeaking, ceramic fractures)?
  4. If there is any measurable difference between interventions, is it both (i.) statistically significant (i.e., beyond the play of chance) and (ii.) clinically relevant (i.e., above a certain threshold recognizable by patients)?
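
The two-part test in question 4 can be illustrated with a minimal sketch. It assumes a between-group mean difference reported with a 95% confidence interval and a pre-specified minimal clinically important difference. The function name, the threshold, and all numbers are hypothetical and are not taken from IQWiG or from the studies discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    statistically_significant: bool
    clinically_relevant: bool


def assess_difference(estimate: float, ci_low: float, ci_high: float,
                      mcid: float) -> Assessment:
    """Classify a between-group mean difference (hypothetical helper).

    estimate, ci_low, ci_high: point estimate and 95% confidence limits.
    mcid: assumed minimal clinically important difference (must be > 0).
    """
    if not ci_low <= estimate <= ci_high:
        raise ValueError("estimate must lie within its confidence interval")
    if mcid <= 0:
        raise ValueError("mcid must be positive")
    # Significant at the 5% level if the 95% CI excludes zero (no difference).
    significant = ci_low > 0 or ci_high < 0
    # Relevant if the size of the point estimate reaches the threshold.
    relevant = abs(estimate) >= mcid
    return Assessment(significant, relevant)


# Hypothetical example: a 2-point gain on a functional score, 95% CI 0.5 to 3.5,
# judged against an assumed threshold of 5 points.
result = assess_difference(2.0, 0.5, 3.5, mcid=5.0)
```

In this invented example the difference is beyond the play of chance but stays below the assumed threshold, so it would count as statistically significant yet not clinically relevant. A stricter variant could require the whole confidence interval to exceed the threshold.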


There is minimal consensus among scientists and health-care professionals that a potentially innovative, useful and valuable intervention requires, at least, a biologically reasonable mode of action (demonstrated by reproducible pre-clinical or animal experiments) which, in theory, may increase the likelihood of better long-term clinical outcomes compared to the standard of care or other conceivable therapeutic options.

Long story short: almost no novel or modern drug or medical device meets, or even comes close to meeting, all of these criteria.


The buzzword Evidence-Based Medicine (EBM) is used excessively but too often erroneously. The term was coined about 30 years ago by Canadian and British clinician scientists to describe "the conscientious, explicit and judicious use of current best evidence in making decisions about the care of individual patients" [1]. A common misconception about the still applicable basic principles of EBM is that its inventors and proponents equated current best evidence with RCTs. Fundamental advancements in information technology, statistical methods, genomics and molecular biology, precision medicine, machine learning, open-access publishing, and many other fields have since superseded the original concept, which nowadays must be considered historical and outdated.

This is particularly true for the well-known evidence pyramid. Bruce G. Charlton, a retired British medical doctor and professor of theoretical medicine, was well known for his often controversial and even bizarre statements, but he was right in stressing that "a hierarchy of methods is amazing nonsense" [2]. This simply means that experimental or non-experimental set-ups must be adapted to the individual problem to be solved; there is no one-size-fits-all ranking of study designs. Also, a poorly planned, conducted and reported, small-sized RCT with unexplained post-randomization drop-outs etc. may be no better, more solid, or more meaningful than a case series, while results from a large-scale registry with nearly complete long-term follow-up may substantially influence clinical practice and health-care decisions.

More flexible approaches to assess the trustworthiness and relevance of scientific evidence are:


  1. the Grading of Recommendations, Assessment, Development and Evaluation (GRADE) instrument, continuously developed and evaluated by the international GRADE Working Group (https://www.gradeworkinggroup.org), which considers individual methodological features of a clinical investigation rather than its general design,
  2. the second version of the Cochrane Risk of Bias Tool (RoB-2) for randomized trials (https://methods.cochrane.org/risk-bias-2).


Without doubt, systematic reviews (of individual studies or systematic reviews) remain the best source for informed decision making in health-care. If done properly, they may show the advantages and disadvantages of diagnostic tests and therapeutic interventions for a certain condition or disease in an unbiased fashion and bring them into the context of actual scientific and clinical standards. A chain is only as strong as its weakest link, and a systematic review can only be as good as its included individual studies and trials.

A best evidence synthesis may be characterized as a special form of a mixed-methods study. It combines principles of systematic reviews and meta-analyses, systematic reviews of systematic reviews, scoping (or narrative) reviews, and health-technology assessments. It usually covers a wide range of evidence (e.g., systematic reviews and meta-analyses of RCTs and cohort studies, individual trials, registries, routine, and administrative data etc.), as derived from a formal, reproducible systematic search of the literature in multiple databases (e.g., PubMed Medline, Ovid Medline and Embase, Cochrane Library, and grey literature), supplemented by a snowball procedure (i.e., a search among related articles and cited references, prompting another search among related articles and cited references, until all related articles and cited references match).
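
The snowball procedure described above is, in essence, a search that repeats until no new articles turn up. A minimal sketch follows, assuming a caller-supplied function `links` that returns the related articles and cited references for one article. The function names and the toy article identifiers are hypothetical, and a real search would query a literature database and screen each hit for eligibility.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Set


def snowball(seeds: Iterable[Hashable],
             links: Callable[[Hashable], Iterable[Hashable]]) -> Set[Hashable]:
    """Follow related articles and cited references until nothing new appears."""
    found = set(seeds)
    queue = deque(found)
    while queue:
        article = queue.popleft()
        for linked in links(article):
            # Only articles not seen before trigger another round of searching.
            if linked not in found:
                found.add(linked)
                queue.append(linked)
    return found


# Toy citation network with made-up identifiers (not real publications).
toy_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A", "D"],
    "D": [],
    "E": ["A"],
}

articles = snowball(["A"], lambda article: toy_links.get(article, []))
```

Starting from article "A", the sketch collects "A", "B", "C" and "D". Article "E" is never found because nothing reachable from the seed points to it, which is one reason the snowball step supplements the database search rather than replacing it.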

Best available evidence on BIOLOX®delta

CeramTec assigned an independent expert to compile the best available evidence on the effectiveness and safety of ceramic bearings focusing on BIOLOX®delta in THA. This included a reproducible search among different databases (e.g., Ovid Medline and Embase), quality assessment, data extraction from original articles and aggregation using advanced statistical methods. Best evidence syntheses share features of so-called living systematic reviews which continuously incorporate and adapt objectives conditional on new information. Originally planned as a single comprehensive overview, multiple new questions arose after digging deeper and deeper into the available amount of data. Consequently, the work had to be split into three consecutive parts, each of which explored different data sources under specific scopes and perspectives.

Current experimental evidence from RCTs of ceramic-on-ceramic (CoC) and ceramic-on-polyethylene (CoP) compared to metal-on-polyethylene (MoP) turned out to be sparse and methodologically weak. To date, there is no large-scale, confirmatory multicenter RCT comparing CoC and/or CoP to alternative pairings in THA.

Unlike the U.S. Food and Drug Administration (FDA), health authorities in Europe, particularly in Germany, tend to be reluctant to work with industry on developing sound but feasible study designs to answer questions of importance to both manufacturers and the health-care system. The IQWiG, the Federal Institute for Drugs and Medical Devices (Bundesinstitut für Arzneimittel und Medizinprodukte, BfArM), and most Notified Bodies in Germany share the position that it is solely the industry’s responsibility to deliver the necessary clinical evidence for decision making, without providing specific guidance on how this should be done. A lack of evidence of benefit caused by a lack of RCTs is not evidence that no benefit exists, and it does not rule out a benefit shown by other data sources. Producers must, of course, prove that their product is as effective as or more effective than the (leading) comparator on the market, while being safe. However, according to the philosopher Hans Albert, the so-called first bridge principle reads "Shall implies Can". (Albert H. Traktat über kritische Vernunft. 5th Ed. Mohr, University of California, 1991.) This means health-care authorities cannot insist on proof of effectiveness by a large-scale RCT if such a trial cannot be conducted for comprehensible and transparently explained reasons.

However, the current best available evidence from international joint registries strongly indicates that BIOLOX®delta bearings (CoC and CoP) in THA are associated with a lower overall risk of revision, mainly driven by a lower risk of revision for periprosthetic joint infection.

Cumulative 2- to 13-year survival of THAs with BIOLOX®delta bearings ranges from 94 to 100%, accompanied by significant improvements in function and pain comparable to those with other couplings.

The incidence of audible noise and/or squeaking reported in clinical studies ranges from 1.6 to 6.8%, with heterogeneous definitions and assessment procedures. There is no consistent or statistically conspicuous association between noise and/or squeaking and pain, function, patient-reported outcome measures (e.g., OHS or WOMAC), and revision rates.

The pathophysiology of audible noise and squeaking with CoC remains unclear. Of note, even if reported by patients, the phenomenon cannot always be reproduced during objective physical examination. There is no valid tool to measure or quantify noise in a longitudinal fashion, and, with the exception of Australia [3], noise is not routinely recorded in joint replacement registries worldwide. Noise does not signal ceramic fracture and is a rare cause of revision. One may even assume a so-called recall bias, as single case reports of CoC-associated squeaking prompted patients who had been free of complaints for a long period to report noise experienced in the past.

Ceramic compounds, however, prevent biofilm formation, as substantiated by multiple laboratory experiments. The observed lower rate of revision for deep, implant-related infections thus has a pathophysiological explanation.

This best evidence synthesis suggests that:

  • Ceramic bearings are 15 to 33 times more likely to avoid a revision for infection than to cause a revision for audible noise.
  • Ceramic bearings are 38 to 85 times more likely to avoid a revision for infection than to cause a revision for ceramic head fractures.
  • Ceramic bearings are 3 to 6 times more likely to avoid a revision for infection than to cause a revision for ceramic liner fractures.
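
The text does not state how these ratios were derived. One plausible reading, sketched here purely as an assumption, divides the number of infection revisions avoided per 10,000 implants by the number of revisions caused by a ceramic-specific event per 10,000 implants. The function name and the input figures are hypothetical placeholders, not data from the synthesis.

```python
def benefit_harm_ratio(avoided_per_10k: float, caused_per_10k: float) -> float:
    """Ratio of revisions avoided to revisions caused (hypothetical helper).

    avoided_per_10k: infection revisions avoided per 10,000 implants.
    caused_per_10k: revisions caused by a specific adverse event per 10,000
        implants; must be > 0 for the ratio to be defined.
    """
    if avoided_per_10k < 0:
        raise ValueError("avoided_per_10k must not be negative")
    if caused_per_10k <= 0:
        raise ValueError("caused_per_10k must be positive")
    return avoided_per_10k / caused_per_10k


# Placeholder figures for illustration only: 40 revisions avoided and
# 2 revisions caused per 10,000 implants.
ratio = benefit_harm_ratio(40, 2)
```

With these placeholder figures the ratio is 20, meaning an avoided revision would be twenty times as frequent as a caused one. Reporting a range, as the bullets above do, would follow from repeating the calculation across the lower and upper estimates of both inputs.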

This comprehensive review suggests a favorable benefit-risk ratio of ceramic bearings (CoC and CoP) compared to other couplings in total hip arthroplasty (THA). While the lack of confirmatory evidence from large-scale randomized controlled trials (RCTs) cannot and must not be denied, registry data speak a clear language.

Ceramic components manufactured from alumina (BIOLOX®forte) and alumina matrix composite (BIOLOX®delta) have established themselves as durable bearings in total hip arthroplasty (THA), either as ceramic-on-polyethylene (CoP, with all its advancements such as highly cross-linked polyethylene [HXLPE]) or as ceramic-on-ceramic (CoC) couplings. The scientific literature emphasizes ceramic-specific adverse events such as squeaking (a phenomenon common to hard-on-hard bearings) and component fractures. However, ceramic bearings show a lower risk of revision for prosthetic joint infection, probably the most serious complication in total joint arthroplasty.


  1. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. BMJ. 1996;312(7023):71-72. doi:10.1136/bmj.312.7023.71.
  2. Charlton BG. Fundamental deficiencies in the megatrial methodology. Curr Control Trials Cardiovasc Med. 2001;2(1):2-7. doi:10.1186/cvm-2-1-002.
  3. Owen DH, Russell NC, Smith PN, Walter WL. An estimation of the incidence of squeaking and revision surgery for squeaking in ceramic-on-ceramic total hip replacement: a meta-analysis and report from the Australian Orthopaedic Association National Joint Registry. Bone Joint J. 2014;96-B(2):181-187. doi:10.1302/0301-620X.96B2.32784.