
Scoring Overview

Descriptions of the scoring criteria and evaluation processes utilized by WSLH PT for regulated qualitative and quantitative analytes follow. Non-regulated analytes are evaluated following similar processes.

Overview: Participants receive PT samples, test them and return results to WSLH PT using the result forms provided. After a series of check-in and verification steps, the results are entered into the scoring database from a data file generated through the scanning process or by manual data entry. The results are evaluated using predefined criteria and event scores are assigned to an analyte/sample/procedure/program. A score is a numeric value reported as a percent (zero to 100%) for a specific event. An event score that satisfies the criteria is termed “satisfactory”; one that does not is termed “unsatisfactory”. An evaluation report is sent to each participant and any designated agency.

Scoring of the results is accomplished by applying the defined processes from two major categories: enumerated and quantitative. Each of these major categories has subtypes depending on the method of determining the target value (referee or peer group).

Referee laboratories are regular participants in a PT program whose scores over the previous three events are satisfactory (for most programs >= 80 percent). Candidate referee laboratories are selected by the database and presented to the program coordinator for final approval. Care is taken to select a slate of referees (minimum of 10) that represents a cross-section of the participants in a particular PT program.

A peer group is a group of participants using any of the following: a specific test system, a specific analytical principle, a composite test system, or all participants in a program (inclusive or “all methods” group). A peer method group is not used in the scoring process until consensus testing validates it.

A scoring group is the peer method group/referee laboratory result group used to calculate the acceptable result/target value and accepted range for scoring purposes.

The validity of grouping results for referee or peer method group scoring is tested by calculating the percent consensus (agreement) of the grouped results and comparing it to the specified consensus criteria. As of April 24, 2003, CLIA regulations use the concept that 80% of the results must match the accepted result or fall within the accepted range in order for the scoring to be considered valid.

Scoring exceptions are handled as follows:

  • Excuse requested status – receives a score of 100 points per result, analyte, sample or program.
  • Late status (results received after the due date) – receives a score of 0 points per result, analyte, sample or program.
  • Missing results for a regulatory analyte/program (one or more of five results left blank) – receives a score of 0 points per result.
  • Less than or greater than – if the number following the less than or greater than sign falls within the accepted range, the result receives a score of 100 points and if the number falls outside of the accepted range, the result receives a score of 0 points.
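The exception rules above can be sketched as a small function. This is only an illustration: the status labels ("excused", "late"), the function name and the inclusive range bounds are assumptions made for the example, not part of the WSLH PT system.

```python
from typing import Optional


def score_result(status: str, reported: Optional[str], low: float, high: float) -> int:
    """Return 100 or 0 points for one result, applying the exception rules.

    status   -- hypothetical label: "excused", "late" or anything else
    reported -- the result as written on the form, e.g. "4.2", "<5", ">200"
    low/high -- accepted range limits (assumed inclusive)
    """
    if status == "excused":
        return 100                      # excuse requested: full points
    if status == "late":
        return 0                        # received after the due date
    if reported is None or not reported.strip():
        return 0                        # missing (blank) result
    text = reported.strip()
    if text.startswith(("<", ">")):
        text = text[1:]                 # score the number that follows the sign
    value = float(text)
    return 100 if low <= value <= high else 0
```

For example, `score_result("ok", "<5", 4.0, 6.0)` gives 100 because 5 lies within the accepted range, while `score_result("ok", ">200", 50, 150)` gives 0.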

There are six different scoring processes utilized by WSLH PT; each is explained below.

The quantitative scoring process is used when the results are numeric.

    A. Quantitative scoring by referee laboratory results (e.g. Blood Lead).

    B. Quantitative scoring by peer method group results (e.g. Chemistry).

The enumerated scoring process is used when the results are non-numeric and must be translated using glossaries/tables.

    C. Enumerated scoring by referee laboratory results-General (e.g. Hematology Cell ID).

    D. Enumerated scoring by referee laboratory results-Microbiology (e.g. Urine Culture).

    E. Enumerated scoring by peer method group results (e.g. Immunoserology).

    F. Enumerated scoring for antibiotic susceptibility.


  A - Quantitative Scoring by Referee

Quantitative programs that use referee labs have a predetermined list of acceptable referee labs indicated as such in the enrollment database. The referee target (mean) and accepted range determination process is as follows:

    1. Calculate the arithmetic mean of the referee laboratories for each sample.
    2. Calculate the accepted range for each sample by applying a specified tolerance factor (stored in a tolerance table) to the referee mean.
    3. Calculate the consensus of the referee labs for each sample by counting the number of results that fall within the accepted range and dividing by the total number of referee labs.
    4. If consensus is >=80%, the mean and accepted range are valid for scoring.
    5. If consensus is <80%, all participants receive a score of 100% for the analyte.
    6. Score all participant results for each sample based on the referee accepted range. If the result falls within the accepted range, the result receives a score of 100 points. If the result falls outside of the accepted range, the result receives a score of 0 points. The event score for the analyte is the summation of all sample points divided by the number of samples.
    7. Standard deviation index (SDI): Once scoring of results is completed, the standard deviation index should be calculated for each result. The SDI is a comparison of a result to the scoring group mean, expressed in terms of the standard deviation and is calculated as follows:

    SDI = (Individual result - Mean) / Standard deviation
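A minimal sketch of the referee scoring steps for one sample follows. The text does not say how the tolerance factor is applied, so the sketch assumes it is a fraction of the referee mean (0.10 meaning plus or minus 10%) and that the range limits are inclusive. All function names are hypothetical.

```python
from statistics import mean

CONSENSUS_THRESHOLD = 80.0  # percent, from the procedure above


def accepted_range(referee_results, tolerance):
    """Accepted range around the referee mean (tolerance assumed fractional)."""
    target = mean(referee_results)
    half_width = tolerance * abs(target)
    return target - half_width, target + half_width


def quantitative_consensus(referee_results, low, high):
    """Percent of referee results that fall within the accepted range."""
    within = sum(1 for r in referee_results if low <= r <= high)
    return 100.0 * within / len(referee_results)


def score_sample(result, referee_results, tolerance):
    """100 or 0 points for one participant result on one sample."""
    low, high = accepted_range(referee_results, tolerance)
    if quantitative_consensus(referee_results, low, high) < CONSENSUS_THRESHOLD:
        return 100  # consensus not reached: all participants receive 100
    return 100 if low <= result <= high else 0


def event_score(sample_points):
    """Event score: sum of sample points divided by the number of samples."""
    return sum(sample_points) / len(sample_points)
```

With five samples scored 100, 100, 0, 100 and 100 points, `event_score` returns 80.0.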

    B - Quantitative Scoring by Peer Method Group
      1. Results are grouped according to method codes. A method code consists of 11 digits made up of the following four parts:
        1. 4 digits that represent the instrument or test kit system
        2. 3 digits that represent the reagent system
        3. 2 digits that represent the measurement principle
        4. 2 digits that represent an internal code used for even more specific grouping purposes
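As an illustration, an 11-digit method code could be split into its four parts as shown below. The sketch assumes the parts appear in the code in the order listed above; the class and field names are invented for the example.

```python
from typing import NamedTuple


class MethodCode(NamedTuple):
    instrument: str   # 4 digits: instrument or test kit system
    reagent: str      # 3 digits: reagent system
    principle: str    # 2 digits: measurement principle
    internal: str     # 2 digits: internal grouping code


def parse_method_code(code: str) -> MethodCode:
    """Split an 11-digit method code into its four parts (order assumed)."""
    if len(code) != 11 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"method code must be exactly 11 digits: {code!r}")
    return MethodCode(code[0:4], code[4:7], code[7:9], code[9:11])
```

Keeping the parts as strings preserves leading zeros, which matters if the codes are used as grouping keys.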

    All results for a given analyte/sample/program may constitute a peer method group (“composite or all methods” group).

      2. The following statistics are calculated using the results for each peer method group:
        1. Mean
        2. Median
        3. Standard deviation (SD)
        4. Coefficient of variation (CV)
      3. After initial calculation of statistics, outlying results are discarded (not included in calculations) using the following rules:
        1. Discard any result that is more than 3 SD above or below the mean.
        2. Recalculate mean, median, SD and CV.
        3. Discard any result that is more than 3 SD above or below the recalculated mean.
        4. Calculate “fences” using the median and discard any result that is outside the fences as described below.

          Fence calculation: First the 25th percentile (Q1) and the 75th percentile (Q3) are calculated from the reduced population for each analyte. Then a scaling factor (SF) is defined as SF = 1.5 x (Q3 - Q1). Finally, the outer fences of the population are set at Q1 - 2 x SF and Q3 + 2 x SF. Any result that falls outside these outer fences is designated a statistical outlier and discarded from further calculation.
        5. Recalculate mean, median, SD and CV for determination of final target value (mean).
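The outlier rules can be sketched with Python's standard `statistics` module. Several details are assumptions, since the text does not specify them: the sample standard deviation is used, percentiles are computed with the "inclusive" interpolation method, and the CV is expressed as a percent. The function names are hypothetical.

```python
from statistics import mean, median, quantiles, stdev


def drop_beyond_3sd(values):
    """Keep only results within 3 SD of the mean of the given values."""
    m, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= 3 * sd]


def drop_outside_fences(values):
    """Keep only results inside the outer fences Q1 - 2 x SF and Q3 + 2 x SF."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    sf = 1.5 * (q3 - q1)                      # scaling factor
    low, high = q1 - 2 * sf, q3 + 2 * sf
    return [v for v in values if low <= v <= high]


def final_statistics(values):
    """Apply the two 3 SD passes and the fence pass, then recompute statistics."""
    kept = drop_beyond_3sd(values)            # first 3 SD pass
    kept = drop_beyond_3sd(kept)              # recalculate, second 3 SD pass
    kept = drop_outside_fences(kept)          # fence pass
    m, sd = mean(kept), stdev(kept)
    return {
        "n": len(kept),
        "mean": m,                            # final target value
        "median": median(kept),
        "sd": sd,
        "cv": 100 * sd / m,                   # percent; undefined if mean is 0
    }
```

Note that `stdev` needs at least two results, so very small peer groups would need separate handling.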

    4. The accepted range is calculated by applying a specified tolerance factor by analyte (from a tolerance table) to the mean determined for the analyte.

    5. Both the upper and lower limits of the accepted range will be rounded using a standard scientific rounding procedure as follows:

    • If the digit to be dropped is less than 5, the preceding figure is not altered.
    • If the digit to be dropped is greater than 5, the preceding figure is increased by 1.
    • If the digit to be dropped is 5, the preceding figure is increased by 1 if it is an odd number and the preceding figure is not altered if it is an even number.
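The three rules above describe what is commonly called round-half-to-even. A short sketch using Python's `decimal` module follows; it assumes a single digit is being dropped, and the function name is invented for the example. Passing the limit as a string avoids binary floating-point surprises such as 2.675 being stored as slightly less than 2.675.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_limit(value: str, places: int) -> Decimal:
    """Round a range limit to `places` decimal places, ties going to the even digit."""
    quantum = Decimal(10) ** -places          # places=1 -> Decimal('0.1')
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_EVEN)
```

For example, `round_limit("2.25", 1)` gives 2.2 (the preceding digit is even and stays), while `round_limit("2.35", 1)` gives 2.4 (the preceding digit is odd and is increased).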

    6. Consensus for each peer method group should be calculated after determination of the accepted range. Consensus is determined by counting the number of results that fall within the accepted range and dividing by the number of included results in the peer method group. If consensus is >=80%, the peer method group is approved as a scoring group. Once approved, the accepted range is copied to a scoring table. If the scoring group does not meet 80% agreement, all results in that group are deemed satisfactory and receive a score of 100 points.

    7. Once the scoring table is complete, each result is compared to its appropriate scoring group. If the result falls within the accepted range, the result receives a score of 100 points. If the result falls outside of the accepted range, the result receives a score of 0 points.

    8. The event score for the analyte is the summation of its sample points divided by the number of samples.

    9. Standard deviation index (SDI): Once scoring of results is completed, the standard deviation index should be calculated for each result. The SDI is a comparison of a result to the scoring group mean, expressed in terms of the standard deviation and is calculated as follows:

    SDI = (Individual result - Mean) / Standard deviation
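As a small worked example of the SDI, the sketch below uses a hypothetical function name and takes the scoring group mean and standard deviation as already calculated.

```python
def sdi(result: float, group_mean: float, group_sd: float) -> float:
    """Standard deviation index: distance from the group mean in SD units."""
    if group_sd == 0:
        raise ValueError("SDI is undefined when the standard deviation is 0")
    return (result - group_mean) / group_sd
```

A result of 105 against a scoring group mean of 100 and SD of 2.5 gives an SDI of 2.0; a result of 95 gives -2.0.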

      C - Enumerated Scoring by Referee-General
    Enumerated programs that use referee labs have a predetermined list of acceptable referee labs indicated as such in the enrollment database. The referee target determination is as follows:

    1. For each analyte/procedure/sample, define the acceptable results using results reported by referee laboratories. Each sample will have only one result expected for it.
    2. Using the program referee list, find and list all referee results by code (from a glossary); create a Referee Result Summary Table.
    3. Calculate the frequency for each result code.
    4. Count the total number of reported results.
    5. Calculate the percent occurrence for each result code by dividing the frequency by the total number of results and multiplying by 100.
    6. Select and save in a table acceptable results based on agreement of results among referee laboratories (percent consensus).
    7. If >=80% of the referee results are for the same code, it is valid to score participant results based on referee results. The particular code is selected as the correct answer.
    8. If <80% of the referee results are for the same code, more than one result code may be accepted as correct based on the judgement of the program coordinator. The coordinator will either select additional acceptable result codes or decide that it is not valid to score participant results based on referee results. (See discussion under Cell Identification Scoring Criteria.) If it is not valid to score participants based on referee consensus, all participants receive a score of 100% for the analyte.
    9. Score all participant results based on the selected acceptable referee results. Each correct result is given 100 points. If the participant result is not among the acceptable referee results, it is given 0 points.
    10. The event score for the analyte is the summation of all sample points divided by the number of samples.
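The frequency and consensus steps above can be sketched as follows. The result codes are treated as plain strings, the function names are hypothetical, and the coordinator's judgement for the below-80% case is not modelled: the sketch simply returns `None` to signal that a coordinator decision is needed, and treats `None` at scoring time as "not valid to score".

```python
from collections import Counter
from typing import Iterable, Optional, Set


def code_percentages(referee_codes: Iterable[str]) -> dict:
    """Percent occurrence of each result code among referee results."""
    codes = list(referee_codes)
    counts = Counter(codes)
    return {code: 100.0 * n / len(codes) for code, n in counts.items()}


def consensus_code(referee_codes: Iterable[str], threshold: float = 80.0) -> Optional[Set[str]]:
    """Return the single accepted code if it reaches the threshold, else None."""
    percents = code_percentages(referee_codes)
    code, pct = max(percents.items(), key=lambda item: item[1])
    return {code} if pct >= threshold else None


def score_code(participant_code: str, accepted: Optional[Set[str]]) -> int:
    """100 points for an accepted code, 0 otherwise; 100 for all if unscorable."""
    if accepted is None:
        return 100
    return 100 if participant_code in accepted else 0
```

If nine of ten referee laboratories report the same code, that code is selected as the correct answer; with seven of ten, the decision passes to the coordinator.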

      D - Enumerated Scoring by Referee-Microbiology
    Enumerated programs that use referee labs have a predetermined list of acceptable referee labs indicated as such in the enrollment database. The referee target determination process is as follows:

    1. For each procedure/sample, define the acceptable results using results reported by referee labs, referring to the Table of Equivalent Results as necessary. The Table of Equivalent Results contains all possible pathogenic organisms, with each organism having its own list of equivalent (acceptable) results. For example, acceptable responses (depending on the complexity of testing a participant performs) for Pseudomonas aeruginosa might include: Gram negative bacterium, Gram negative rod, Proteus-Pseudomonas group, Pseudomonas species or Pseudomonas aeruginosa.
    2. Count the total number of reported results for each procedure/sample and the frequency of results for each identification code (from the glossary of organisms/codes).
    3. Calculate the percent for each identification code. This is the frequency divided by the total number of results x 100.
    4. Create a Referee Results Summary Table that includes: identification term, identification code from the glossary, frequency of results and percent. A Referee Results Summary Table is created for each procedure and specific sample.
    5. For each Referee Results Summary Table, compare the identification codes to the Table of Equivalent Results by organism. Results are acceptable (equivalent) when the reported results are contained in the table. For each identification code contained in the table, sum the percentages of the equivalent results.
    6. If the percent agreement (consensus) among referee labs is >=80%, it is acceptable to score participant results based on referee results. The target organism and its acceptable equivalent results are stored in a table.
    7. Participant results are compared to the target organism and equivalent result table. If a participant result is found in the list of equivalent results, assign 100 points. If a participant result is not one of the equivalent results, assign 0 points.
    8. If percent agreement (consensus) among referee labs is <80%, it is not acceptable to score participant results and all participants receive a score of 100% for the procedure/sample.
    9. Participants are penalized for reporting extraneous organisms that are known to be absent in the sample.
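A minimal sketch of consensus over equivalent results follows. The table holds only the Pseudomonas aeruginosa example from step 1, the target organism is assumed to be known in advance, and identification terms stand in for glossary codes. The penalty for extraneous organisms is not modelled because the text does not say how it is calculated. All names are hypothetical.

```python
# Hypothetical excerpt of a Table of Equivalent Results, keyed by target organism.
EQUIVALENT_RESULTS = {
    "Pseudomonas aeruginosa": {
        "Gram negative bacterium",
        "Gram negative rod",
        "Proteus-Pseudomonas group",
        "Pseudomonas species",
        "Pseudomonas aeruginosa",
    },
}


def equivalent_consensus(referee_results, target):
    """Summed percent of referee results that are equivalent to the target."""
    accepted = EQUIVALENT_RESULTS[target]
    matches = sum(1 for r in referee_results if r in accepted)
    return 100.0 * matches / len(referee_results)


def score_identification(result, referee_results, target, threshold=80.0):
    """100 or 0 points; everyone receives 100 if referee consensus is not reached."""
    if equivalent_consensus(referee_results, target) < threshold:
        return 100
    return 100 if result in EQUIVALENT_RESULTS[target] else 0
```

Summing over the equivalent results is what lets laboratories of different testing complexity contribute to the same consensus figure.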

      E - Enumerated Scoring by Peer Method Group

    1. For each analyte/sample/procedure, group all participant results according to peer method code (table of all possible method codes). Create a Participant Result Summary Table containing analyte/procedure, sample number, method code, result, frequency, percent and acceptable results.
    2. Count the total number of reported results and the frequency of results for each result.
    3. Calculate the percent for each result by dividing the frequency by the total number of results and multiplying by 100.
    4. For each peer method group calculate, select, label and store the mode (most frequently occurring) for result items. The mode indicates the correct result.
    5. If >=80% of the participant results are for the same one result (mode), it is valid to score all participant results based on peer method group results.
    6. If <80% of participant results are for the same one result, more than one result may be accepted as correct based on the judgement of the coordinator. The coordinator will select additional acceptable result codes or decide that it is not valid to score participant results based on peer group results. In the case where responses are expressed as dilutions, if the target value falls between two dilutions, the range becomes the greater dilution plus one dilution and the smaller dilution minus one dilution. If 80% peer group consensus is not achieved, all participants receive 100 points for the analyte/procedure/sample.
    7. Score all participant results for each analyte/procedure based on the selected acceptable results determined by the peer method results.
    8. Each correct answer is given 100 points. If the participant result is not among the acceptable results, it is given 0 points.
    9. For each participant reporting results, calculate the event score by dividing the summation of all sample points by the number of scored samples. Because each correct result is worth 100 points, the quotient is already a percent.
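The grouping and mode steps can be sketched as below for one analyte/sample. Results are assumed to arrive as (method code, result) pairs, and the function name is hypothetical. Ties for the mode are resolved by first occurrence, which is an artifact of the sketch rather than a stated rule.

```python
from collections import Counter, defaultdict


def peer_group_modes(results):
    """Map each method code to (mode result, percent of group reporting it)."""
    groups = defaultdict(list)
    for method_code, result in results:
        groups[method_code].append(result)
    summary = {}
    for method_code, values in groups.items():
        mode, frequency = Counter(values).most_common(1)[0]
        summary[method_code] = (mode, 100.0 * frequency / len(values))
    return summary
```

A group whose mode reaches 80% can be scored directly against that result; a group below 80% would go to the coordinator as described in step 6.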

      F - Enumerated Scoring for Antibiotic Susceptibility

    1. For each antibiotic code (in glossary) count the total number of results. If the total is <10, it is not valid to score the antibiotic.
    2. For each antibiotic code, if the total number of results is >=10, select the antibiotic for consideration in the scoring process.
    3. Calculate the frequency and percent of S (susceptible), I (intermediate) and R (resistant) results. Add the %S and %I together.
    4. If %R or %S+I is >=80%, the antibiotic is valid for scoring.
    5. If neither %S+I nor %R is >=80%, it is not valid to score the antibiotic.
    6. Create a Table of Antibiotic Results containing antibiotic name, antibiotic code, result frequency (S+I+R), number of S, %S, number of I, %I, number of R, %R and indication of results judged acceptable and valid for scoring.
    7. For each participant, the list of antibiotics reported and the sensitivity or resistance to each is compared to the Table of Antibiotic Results. For each antibiotic reported that matches acceptable and valid antibiotics in the table, 100 points are assigned. If a participant’s antibiotic and sensitivity result differs from the table, 0 points are assigned.
    8. For each participant an event score for antibiotic susceptibility is calculated by dividing the total points for all valid antibiotics by the number of valid antibiotics.
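The susceptibility rules can be sketched as follows. The sketch assumes that when %S+I reaches 80%, both S and I are accepted as correct, which the text implies but does not state outright. Antibiotics that are not valid for scoring are skipped when the event score is calculated. The function names and antibiotic codes are invented for the example.

```python
from collections import Counter


def accepted_interpretations(results, min_results=10, threshold=80.0):
    """Return the accepted set ({'R'} or {'S', 'I'}) for one antibiotic, or None."""
    total = len(results)
    if total < min_results:
        return None                               # too few results to score
    counts = Counter(results)
    pct_r = 100.0 * counts["R"] / total
    pct_si = 100.0 * (counts["S"] + counts["I"]) / total
    if pct_r >= threshold:
        return {"R"}
    if pct_si >= threshold:
        return {"S", "I"}
    return None                                   # no consensus: not scored


def susceptibility_event_score(reported, table):
    """Average points over the valid antibiotics a participant reported.

    reported -- {antibiotic code: 'S' | 'I' | 'R'}
    table    -- {antibiotic code: accepted set or None}
    """
    points = [
        100 if result in table[code] else 0
        for code, result in reported.items()
        if table.get(code)                        # skip unknown or invalid codes
    ]
    return sum(points) / len(points) if points else None
```

Returning `None` when no valid antibiotic was reported leaves the handling of that case to the caller, since the text does not describe it.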


    Cell Identification Scoring Criteria

    1. Number of Challenges. The specialty will provide five morphologic challenges per testing event. There will be three testing events per year.
    2. Types of Challenges. For each event, participants will be sent a combination of morphologic challenges that include moderately complex and highly complex skill levels. Challenges may be selected from normal, healthy individuals or may focus on a particular disease state in which normal and abnormal cells may be present. In these cases cells characteristic of the disease state will be selected. All challenges for an event will come from the same sample or donor whenever possible.
    3. Types of Responses. A glossary of acceptable responses will be provided to participants for each event. This is a comprehensive list that will meet the needs of moderate and high complexity laboratories.

    Moderate Complexity Laboratories: A general knowledge of cellular elements in normal peripheral blood is required. Common atypical or immature blood cells such as atypical lymphs, bands and polychromatophilic erythrocytes should be identified. Common red blood cell morphology should also be identified. The presence of uncommon atypical or immature cells (precursor cells, large or abnormal platelets, or extensive abnormal RBC morphology) needs to be recognized and referred.

    High Complexity Laboratories: A comprehensive knowledge of normal and abnormal/immature production in all cell lines is required. All distinctive morphological characteristics, both normal and abnormal, in all cell lines need to be identified. This category would include blast cells, prolymphocytes, plasma cells, and red blood cells with Howell-Jolly bodies or other distinguishable inclusion bodies.
    4. Determination of Target Values. The criterion for satisfactory performance for cell identification is 80% or greater consensus on each morphologic challenge among 10 or more referee laboratories. Referee laboratories are selected based on satisfactory performance from the previous year and represent a cross section of our participant population. More than one correct identification for each morphologic challenge may be considered satisfactory if there is scientific justification.

    • Normal cell types (neutrophils, monocytes, basophils, eosinophils, lymphocytes, erythroid cells and platelets) will not be combined to determine an acceptable response.
    • Mature and immature cells will not be combined to determine an acceptable response.
    • Malignant and benign cells will not be combined to determine an acceptable response.
    • Abnormal findings critical to the diagnosis of a certain disease state (blasts, malignant cells, infectious agents or sickle cells) must be correctly identified as such and will not be grouped with other cell types.
    5. Scoring. Scoring of participant responses will be based on referee consensus if 80% consensus is reached. All responses that match the referee response for a particular challenge are given “satisfactory” status and all responses that do not match are given “unsatisfactory” status. If referee consensus for a particular challenge does not reach 80%, the challenge is not scored and all responses are deemed satisfactory.
    6. Determination of Analyte Score.
    Analyte score per testing event = (Number of acceptable responses for analyte / Total number of challenges) x 100
    7. References. Federal Register, Vol. 57, No. 40, Section 493.941, p. 7159.
    Federal Register, Vol. 58, No. 141, Section, p. 39873.


     


     

    WSLH Proficiency Testing
    465 Henry Mall: Room 402
    Madison WI 53706-1578
    Phone: 1-608-265-1100
    Toll Free: 1-800-462-5261
    Fax: 1-608-265-1111

    Wisconsin State Laboratory of Hygiene
    Copyright © 2003 The Board of Regents of the University of Wisconsin System.