Urethral repair is considered a complex task for urologists, with the most demanding clinical needs arising from urethral strictures in adults and congenital anomalies (e.g., hypospadias) in children. In adults, it has been estimated that 0.1% of men above the age of 65 years suffer from urethral strictures, which can occur secondary to different etiologies, including pelvic trauma, lichen sclerosus, non-specific urethritis, and iatrogenic injuries. The selection of the surgical management approach is based on the extent and etiology of the stricture. For distal strictures, the approach includes initial endoscopic urethrotomy or dilatation, and subsequent surgical interventions are usually reserved for cases in which the urethrotomy fails or the stricture recurs. For short strictures, anastomotic urethroplasty generally solves the problem. The preferred surgical approach for strictures longer than 1 cm or for complicated cases is replacement urethroplasty using an autologous graft, which can be applied either as a patch or in full circumference. Although different sources of grafting material have been tried, buccal mucosa is currently considered the preferred option owing to its inherent lack of hair, compatibility with a moist environment, and low rate of donor site morbidity. In pediatric patients, hypospadias has an incidence of around 1 per 300 male newborns, in whom the shortage of the urethra is a surgical challenge. The currently practiced surgical techniques for the management of these diseases have high complication rates and require specific skills to be applied optimally. For both adult and pediatric patients, there is a wide consensus about the need for further consolidated basic research, including the use of tissue engineering and regenerative medicine techniques for urethral reconstruction.
Among the various animal models used to investigate male urethral repair strategies, the rabbit model has been by far the most popular choice. The male rabbit’s urethra is easily accessible and possesses remarkable histological and functional similarities to the human urethra, such as a thin epithelial layer supported by the highly vascularized spongiosum and a urethral smooth muscle layer contributing to the urethral tone. In addition, the size of an adult rabbit’s urethra is comparable to that of a male infant, allowing the use of transurethral instrumentation and procedures employed in pediatric surgery. Studies using rabbits have, therefore, been instrumental in demonstrating the feasibility of urethral reconstruction using a variety of synthetic and natural polymeric matrices. Several approaches have been under scrutiny, most of which have shown the ability to support the recovery of normal urethral architecture and function, with a similar performance to that of autologous tissue grafts. However, a recent meta-analysis of 63 preclinical and 13 human studies of tissue engineering for urethral reconstruction revealed that the efficacy of these approaches could not be defined because of the lack of well-controlled preclinical investigations. The study also revealed that the promising preclinical results obtained using cell-laden matrices could surprisingly not be translated into the clinical studies.
Studies in other fields have also shown a number of difficulties in assessing the efficacy or translating the results from animal research to the clinical context. The issues include physiological variations among species and strains, lack of randomization and blinding, inadequate reporting of methods and materials, and publication bias arising from the failure to report trials with adverse or indeterminate outcomes, which leads to an overestimation of the impact of a therapy. In 2009, the National Centre for the Replacement, Refinement, and Reduction of Animals in Research (NC3Rs) examined the nature of the reporting, experimental design, and statistical analysis in 271 published preclinical experiments. The survey showed several shortcomings in study design, statistical analysis, and reporting, and inspired the publication of the Animal Research: Reporting In Vivo Experiments (ARRIVE) guidelines in 2010. The checklist consists of 20 items that cover the critical data to be reported in a preclinical scientific paper. However, despite increased awareness in the scientific community, including the adoption of the guidelines by more than 1000 scientific journals, the quality of reporting has not significantly improved in various research fields. In 2018, the NC3Rs formed an international working group involving journal editors, researchers, and statisticians from a variety of fields with the aim of reviewing and updating the guidelines. As a result, a revised version of the guidelines has recently been published (ARRIVE 2.0).
While the reasons the positive results obtained in preclinical studies for urethral repair have not been reproduced in the subsequent clinical trials are complex, poorly designed experiments and a lack of quality in reporting might be some of the main reasons that hinder clinical translation. However, the quality of reporting in preclinical urethral tissue engineering studies remains unclear. Moreover, it is unknown if the quality of reporting has improved as a consequence of the introduction of the ARRIVE guidelines.
Two separate searches were conducted in March 2020, one in MEDLINE (via PubMed) and one in EMBASE (via Ovid SP). The search terms selected were as follows: rabbit, tissue engineering, stem cell, scaffolds, autologous graft, urethral graft, urethral reconstruction, regenerative medicine, reconstructive surgery, urethra, and animal experimentation. The searches were restricted using the fields appropriate to each database, such as MeSH term, Text Word, and All Fields. The filters “Publication date: 01/01/2014 to present” and “English language” were applied. As the ARRIVE guidelines were first released in 2010, we chose to start our search in 2014, assuming that four years would allow the authors of preclinical studies sufficient time to plan, perform, and publish the results according to the guidelines. Details of the search are represented in the PRISMA flow diagram (Figure 1).
Data were extracted into a standardized framework derived from the ARRIVE checklist by two independent reviewers (A.K.P.S. and A.A.). The ARRIVE guideline consists of 20 items, some of which are further divided into subitems. For the purpose of this work, a list of 38 items was elaborated, which includes the items and subitems from the original ARRIVE guideline. Each of these 38 items was evaluated as “Yes” or “No” to indicate whether it was reported in the study or not. For some of the questions, a third option, “Not Applicable (N/A)”, was included to indicate items that were not relevant for the study (for example, in experiments employing only one experimental group, item 11a concerning allocation becomes N/A). A training phase, consisting of a detailed description and examples of scoring, was conducted among the authors before the commencement of the data extraction. Specific operational instructions were provided to both reviewers before they read the selected full-text articles, and each reviewer extracted the data blinded to the assessment of the other reviewer. Discrepancies were subsequently resolved by an additional independent researcher (T.A.).
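The comparison of the two extraction sheets can be illustrated with a minimal sketch. It assumes each reviewer's ratings for one article are stored as a mapping from checklist item to "Yes", "No", or "N/A"; the function name `find_discrepancies` and the example ratings are hypothetical and are not taken from the study.

```python
from typing import Dict, List

# Ratings allowed by the extraction framework described above.
VALID_RATINGS = {"Yes", "No", "N/A"}


def find_discrepancies(reviewer_a: Dict[str, str],
                       reviewer_b: Dict[str, str]) -> List[str]:
    """Return the checklist items on which two reviewers disagree.

    Both arguments map a checklist item id (e.g. "6b") to a rating.
    Items are returned in the order they appear in reviewer_a.
    """
    if reviewer_a.keys() != reviewer_b.keys():
        raise ValueError("Both reviewers must rate the same set of items")
    disputed = []
    for item, rating_a in reviewer_a.items():
        rating_b = reviewer_b[item]
        if rating_a not in VALID_RATINGS or rating_b not in VALID_RATINGS:
            raise ValueError(f"Unexpected rating for item {item}")
        if rating_a != rating_b:
            disputed.append(item)
    return disputed


# Hypothetical ratings for four items of a single article.
rater_1 = {"5": "Yes", "6b": "No", "11a": "N/A", "17b": "No"}
rater_2 = {"5": "Yes", "6b": "Yes", "11a": "N/A", "17b": "No"}

# Items listed here would be passed on to the third researcher.
print(find_discrepancies(rater_1, rater_2))
```

In this sketch only the disputed items are forwarded for adjudication, which mirrors the role described for the additional independent researcher.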
The data were compiled employing a Microsoft Excel spreadsheet and analyzed using IBM SPSS Statistics (version 21). For each of the selected studies, a score was calculated, which represents the percentage of positively reported items. The score was calculated using the following formula:
$$\text{Score} = \left( \frac{N_{\text{yes}}}{38 - N_{\text{na}}} \right) \times 100$$

where $N_{\text{yes}}$ = number of “Yes” entries, $N_{\text{na}}$ = number of “Not Applicable” entries, and 38 is the total number of items in the ARRIVE guideline. The units of analysis were the individual articles when assessing the scores, and the single ARRIVE item when assessing their adherence across studies. A further analysis was performed to assess the adherence of the studies to several subitems within the ARRIVE checklist. A Mann–Kendall nonparametric test was used to assess whether the scores had a monotonic trend over the years. This is a simple, but robust non-parametric test that does not require the data to be normally distributed or follow a linear trend. The intraclass correlation coefficient (ICC) analysis was utilized to examine the inter-rater agreement between the two reviewers. The ICC was selected because it reflects both degree of correlation and agreement between measurements. ICC values <0.5 are indicative of poor reliability, 0.5–0.75 indicate moderate reliability, 0.75–0.9 indicate good reliability, and >0.90 indicate excellent reliability. Statistical significance was set at p < 0.05.
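The score formula and the trend test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SPSS procedure used in the study: the function names are hypothetical, the Mann–Kendall implementation uses the standard normal approximation with a tie correction, and the input series is assumed to be ordered in time (for example, one summary score per publication year).

```python
import math
from typing import Sequence, Tuple

TOTAL_ITEMS = 38  # items and subitems in the checklist used in this work


def arrive_score(n_yes: int, n_na: int, total: int = TOTAL_ITEMS) -> float:
    """Percentage of applicable items reported: N_yes / (total - N_na) * 100."""
    applicable = total - n_na
    if applicable <= 0:
        raise ValueError("At least one item must be applicable")
    if not 0 <= n_yes <= applicable:
        raise ValueError("n_yes must lie between 0 and the applicable items")
    return n_yes / applicable * 100


def mann_kendall(values: Sequence[float]) -> Tuple[int, float, float]:
    """Mann-Kendall trend test on a time-ordered series.

    Returns (S statistic, z value, two-sided p-value), using the normal
    approximation with a correction for tied values.
    """
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference

    # Variance of S, reduced for each group of tied values.
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in tie_counts.values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18

    # Continuity-corrected z value; S == 0 gives z == 0 (no trend).
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return s, z, p_value


# Hypothetical article: 18 items reported, 2 items not applicable.
print(arrive_score(n_yes=18, n_na=2))

# Hypothetical, strictly increasing yearly scores.
s_stat, z_value, p = mann_kendall([40.0, 42.5, 45.0, 47.5, 50.0])
print(s_stat, p < 0.05)
```

With short series the normal approximation is only rough, so an exact or permutation-based p-value may be preferable in practice.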
As shown in Figure 1, a total of 189 articles were initially screened after the literature search. Following the inclusion and exclusion criteria, a total of 43 studies were selected for full-text reading (Figure 1). Only 28 studies were considered eligible for quality appraisal. These studies comprise a range of approaches and scaffolds for urethral repair in rabbits, which are summarized in Supplementary Table S2. The table provides details about the strain, sex, age, weight, number of animals, graft approach, material, and duration of the implantation. The number of rabbits in each experiment was 20 on average and ranged between 7 and 36, and the post-implantation follow-up duration was 5 months on average and varied between two weeks and nine months. All studies used male rabbits, except one study that included male and female animals. The most commonly studied approach was the use of acellular matrices as a patch (21 studies; average length 15 mm, range 5–20 mm) rather than as a tube (7 studies; average length 20 mm, range 10–30 mm).
The scores for each of the checklist items are shown in Figure 2. The data were clustered into three groups to show the level of adherence to the guidelines for each of the 38 items in the checklist. While 34% of the items (13/38) appeared in the green category (agreement between 80 and 100%), 21% of the items (8/38) appeared in the orange category (agreement between 50 and 79%), and 45% of the items (17/38) fell into the red category (agreement between 0 and 49%).
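The three-colour grouping can be expressed as a simple threshold rule. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the category boundaries sit at 80% and 50%, so that a non-integer value such as 79.5% falls into the orange category.

```python
def adherence_category(percent: float) -> str:
    """Map an item's adherence percentage to the colour category."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must lie between 0 and 100")
    if percent >= 80:
        return "green"   # 80-100%
    if percent >= 50:
        return "orange"  # 50-79%
    return "red"         # 0-49%


# Item counts per category as reported in the text (13 + 8 + 17 = 38).
category_counts = {"green": 13, "orange": 8, "red": 17}
total_items = sum(category_counts.values())

# Share of the 38 checklist items in each category, rounded to whole percent.
category_shares = {
    name: round(count / total_items * 100)
    for name, count in category_counts.items()
}
print(category_shares)
```

The rounded shares reproduce the 34%, 21%, and 45% quoted for the green, orange, and red categories.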
The items that attained the highest scores included the number of animals utilized, the size of experimental and control groups, and the definition of experimental outcomes. A description of the study background, including context and rationale, was also adequately provided in all of the studies. However, variables relevant to the reproducibility of the experiments were not often disclosed. The least frequently reported checklist items (found in ≤10% of the studies) were items 18c (interpretation), 7b (experimental procedures), 17b (adverse events), 6c (experimental unit), 13b (analysis unit), 7d (procedure rationale), 11b (animal allocation), 13c (statistical design), and 10b (sample size). These items are related to information on test methods, sample size calculation, statistical approaches, adverse events, and interpretation/scientific outcomes. Although statistical methods were disclosed and described in almost two-thirds of the articles, the statement of statistical methodology was frequently inadequate. Most articles did not include information about data distribution, definition of the unit of analysis, or justification for choosing a specific analytical method. Surprisingly, none of the articles disclosed the approach used to verify that the assumptions for the statistical methods were met. Although every article described the experimental results, some of them failed to specify the primary and secondary outcomes.
The adherence of the studies to the individual components of selected items, rather than to the overall item, is shown separately in Table 1. Concerning the compliance with ethical standards, 92% of the analyzed studies stated that the protocol was approved and 89% referred to national or international guidelines (Table 1a). However, a complete ethical statement, disclosing both the approval of the protocol and guidelines followed, was only present in 46% of the studies. Concerning study design, various essential items were poorly reported. While recording of randomization (6b) scored 46%, none of the 28 studies reported sample size estimation or steps to reduce assessment bias (Table 1b). Item 7a (experimental procedure) scored 96% because the majority of studies reported surgical procedures and anesthesia, but very few studies reported post-operative analgesia or euthanasia (Table 1c). Regarding the details of the animals, the most frequently reported information was the weight. However, only 14% of the studies provided the age of the animals (Table 1d). Details concerning the animals’ housing (9a) were infrequently listed (less than 30% of all studies). Only a few studies reported the type of facility, cage, bedding material, or number of cage companions (Table 1e). Data about nutritional aspects and environment, such as temperature, humidity, and access to water and food, were also infrequently reported. No environmental enrichment was reported in any of the studies (Table 1f). In relation to the study limitations (18b), most studies stated general limitations and potential sources of bias. However, only a few articles addressed the limitations of the animal model or imprecision of the results (Table 1g).
| Item | Ratio | % |
|---|---|---|
| (a) Item 5: Ethical statements | | |
| Refers to guidelines | 25/28 | 89 |
| (b) Item 6b: Steps to minimize subjective bias in the study design | | |
| (c) Item 7a: Information about experimental procedures | | |
| Method of euthanasia | 4/28 | 14 |
| (d) Item 8a: Details about animals in the experimental design | | |
| Weight and age | 4/28 | 14 |
| Weight but not age | 18/28 | 64 |
| Age but not weight | 0/28 | 0 |
| (e) Item 9a: Information about housing conditions | | |
| Type of facility | 5/28 | 18 |
| Type of cage | 3/28 | 11 |
| Number of cage companions | 1/28 | 4 |
| (f) Item 9b: Information about nutritional aspects and environment | | |
| Access to water / food | 3/28 | 11 |
| (g) Item 18b: Disclosure of limitations in the results' interpretation | | |
| General limitations, including potential sources of bias | 21/28 | 75 |
| Limitations of the animal model | 10/28 | 36 |
Our study found that published animal experiments studying tissue engineering approaches for urethral repair display inadequate reporting of fundamental information. The quality of reporting improved only marginally over the study period. Inadequate reporting of the critical points of research experiments could substantially impair the clarity, reproducibility, and translatability of the findings. We encourage the use of the ARRIVE checklist items when reporting preclinical studies, so that published manuscripts allow a precise judgment of their scientific merit. This has the potential to enhance both the translatability of the findings to humans and the fulfilment of ethical requirements, and it further supports objective comparison between different studies.