Development of the scale
Identification of potential items
Our own observational (prevalence) surveys in women (patients) with and without hormone use identified many differences in complaints or effects between treated women in the early phase of treatment and untreated women. These differences were considered candidates for items of a new scale (unpublished report). A large observational study in 9 countries on 5 continents also contributed to the identification of possible items (unpublished research report and [1,2]).
Another study in Italy (drug safety surveillance study EURAS-OC Italy, unpublished) used a checklist of potentially beneficial effects of treatment with oral contraceptives (“Benefit Questionnaire”, not validated). This checklist, which was based on clinical experience, also served as a pool of potential items for the new scale. In addition, discussions with women (patients) about their experience during treatment were held in the EURAS-OC Italy surveillance study. Extensive literature reviews were done to list positive and negative short-term effects of hormone use. Moreover, we reviewed items used in validated scales for women that relate to the concept and domains of the new scale (including MRS, FSFI, FSDS, and DISF), also looking at the way the symptoms/complaints were phrased.
This led to a steadily growing list of possible items. A group of experts experienced in hormone studies in women and in constructing questionnaires/scales reviewed and corrected this long list of potential markers and overlapping items, which reduced it from about 80 items to about 40 items. The latter were put into the raw scale, supplemented by an instruction and examples of how to answer.
A cognitive debriefing of the first draft of the raw scale was performed as part of the translation of the English draft into Italian (10 women: convenience sample within TNS Milan), and the final raw scale was designed thereafter.
Before the raw scale was applied in the normative study in an Italian sample, an official pilot study was carried out by TNS Infratest (Milan, Italy). Twenty women completed the questionnaire and commented on the clarity of the items, the introduction, the response categories of the Likert scale, and the presentation of the questionnaire in general. Some minor revisions followed.
Choice of data collection
The scale was designed as a paper-based, self-administered PRO scale. No evidence is available yet on how electronic or web-based administration would influence the results. A major effect was considered unlikely.
Choice of recall period
The goal is the assessment of short-term hormonal effects. Therefore, the focus of interest was defined as the recent period (limited to the most recent 6 months). It can be assumed that the experience gathered in the most recent month will dominate patients’ responses.
A Likert scale was used to document the response. For each item, the patient has to check whether the symptom/complaint applies to her and, if so, how severe, intense, or strong she perceived it to be. The appropriate box has to be marked.
Evaluation of patients’ understanding
The long raw scale was discussed with women several times during the collection of potential items for the study (see above). It was planned to document the sequence of changes to the raw scale during development. In the early developmental phase, the raw scale (or rather, the item listing) was still in English.
The formal development of the scale was planned with a sample of women in Italy. A cognitive debriefing was planned as part of the translation into Italian (see above). The raw questionnaire was altered, but not all suggestions were included in the version for the normative study.
As mentioned above, a pilot study of the raw scale was done by TNS Infratest (Milan, Italy) immediately before the field study. Again, only the most important suggestions led to changes. It was decided to keep some of the other suggestions in mind for future revisions, provided a meaningful scale could be created at all.
Development of format
The raw and final scales were designed as paper-based scales with a short instruction. Response categories on a Likert scale from 1 to 5 describe the personally perceived severity (or intensity) of the complaints (items). All items were phrased in a negative direction (complaints), based on our own experience with the development of other HRQoL scales.
Two examples were provided together with the raw scale to familiarize women with completing the questionnaire.
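The response format described above (one marked category from 1 to 5 per item) can be illustrated with a minimal sketch of a plausibility check on one completed questionnaire. The function name, the data layout, and the treatment of unmarked or out-of-range boxes are assumptions made for illustration. The source does not describe how such responses were handled.

```python
LIKERT_MIN, LIKERT_MAX = 1, 5  # response categories stated in the text


def check_responses(responses, n_items):
    """Split one woman's answers into valid ratings and items needing follow-up.

    `responses` maps item number -> marked category (None if no box was marked).
    This layout is hypothetical; it is not taken from the original study.
    """
    valid, problems = {}, []
    for item in range(1, n_items + 1):
        value = responses.get(item)
        if isinstance(value, int) and LIKERT_MIN <= value <= LIKERT_MAX:
            valid[item] = value
        else:
            # Unmarked box, missing item, or a value outside 1..5
            problems.append(item)
    return valid, problems


# Usage with invented answers to a 5-item excerpt
valid, problems = check_responses({1: 1, 2: 4, 3: None, 4: 7}, n_items=5)
```

Here items 3, 4, and 5 would be flagged, because item 3 is unmarked, item 4 is out of range, and item 5 is absent.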
The raw scale (90 items) was structured into five sections: “complaints/symptoms” (43 items), “use of hormones” (12 items), “reproductive history” (3 items), “lifestyle markers and demographic” (10 items), and “conditions/diseases” (22 items). The latter four sections were deemed necessary for the interpretation of the structural (factorial) analysis and for a meaningful item selection process.
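The section structure of the raw scale can be written down as a small data structure, which also confirms that the five section sizes add up to the 90 items stated. The section names and counts come from the text. The consecutive item numbering in the order listed is an assumption for illustration only, since the actual item order of the questionnaire is not given.

```python
# Section names and item counts as reported for the raw scale
RAW_SCALE_SECTIONS = {
    "complaints/symptoms": 43,
    "use of hormones": 12,
    "reproductive history": 3,
    "lifestyle markers and demographic": 10,
    "conditions/diseases": 22,
}


def item_ranges(sections):
    """Assign consecutive item numbers to each section (assumed ordering)."""
    ranges, start = {}, 1
    for name, count in sections.items():
        ranges[name] = range(start, start + count)
        start += count
    return ranges


total_items = sum(RAW_SCALE_SECTIONS.values())
ranges = item_ranges(RAW_SCALE_SECTIONS)
complaint_items = list(ranges["complaints/symptoms"])  # the 43 candidate scale items
```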
The selection of items was based on a series of factorial analyses in a stepwise approach. The developer (Lothar Heinemann) started with the full raw scale (all 90 items) to analyze its internal structure and to understand the relations among specific items, demographic variables, and medical and reproductive history data. Based on the information gathered in these initial steps, we systematically reduced the number of factors until we reached a minimum with good face validity. Thereafter, the analysis focused exclusively on the specific complaints/symptoms potentially related to hormonal effects (43 items). Using several further factorial analyses, we approached the final scale.
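The idea of stepping down the number of factors can be sketched on simulated data. The sketch below is not the analysis used in the study: the extraction method, any rotation, and the software are not stated in the text. It assumes an unrotated principal-component extraction from the item correlation matrix, computed by power iteration, and uses six invented items driven by two latent factors. All function names and data are hypothetical.

```python
import math
import random


def simulate_items(n_women=300, seed=0):
    """Hypothetical data: 6 items on a 1-5 scale driven by 2 latent factors."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_women):
        factor_a, factor_b = rng.gauss(0, 1), rng.gauss(0, 1)
        row = []
        for item in range(6):
            latent, weight = (factor_a, 0.9) if item < 3 else (factor_b, 0.7)
            raw = weight * latent + math.sqrt(1 - weight ** 2) * rng.gauss(0, 1)
            row.append(min(5, max(1, round(3 + raw))))  # clamp to categories 1..5
        rows.append(row)
    return rows


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def top_eigenpairs(matrix, k, iterations=1000, seed=1):
    """Largest k eigenpairs of a symmetric matrix (power iteration, deflation)."""
    p = len(matrix)
    rng = random.Random(seed)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        vec = [rng.random() for _ in range(p)]
        for _ in range(iterations):
            new = [sum(work[i][j] * vec[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in new))
            vec = [x / norm for x in new]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector
        value = sum(vec[i] * sum(work[i][j] * vec[j] for j in range(p))
                    for i in range(p))
        pairs.append((value, vec))
        # Remove the extracted component before searching for the next one
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(p)]
                for i in range(p)]
    return pairs


def loadings_for(pairs, k):
    """Item-by-factor loadings: eigenvector scaled by sqrt(eigenvalue)."""
    p = len(pairs[0][1])
    return [[pairs[f][1][i] * math.sqrt(pairs[f][0]) for f in range(k)]
            for i in range(p)]


corr = correlation_matrix(simulate_items())
pairs = top_eigenpairs(corr, 3)
n_items = len(corr)

# Step down the number of retained factors and record the variance share kept
explained = {k: sum(value for value, _ in pairs[:k]) / n_items for k in (3, 2, 1)}
loadings = loadings_for(pairs, 2)
for k in (3, 2, 1):
    print(f"{k} factors: {explained[k]:.2f} of total variance")
```

In practice, the decision where to stop rests on the loadings and on content considerations such as face validity, as described above, rather than on the variance share alone.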