The Derivation, Reporting and Interpretation of Language Test Scores: Some Recommendations for the TEM

Updated: 2024-03-29  Author: user submission

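The Test for English Majors, Band 4 and Band 8 (TEM4 and TEM8, together the TEM), is a large-scale national test of English as a foreign language. Its candidate population covers all full-time foreign-language majors in mainland China. Because its results are widely recognized by society, it has in effect become a high-stakes, external, comprehensive language test. Yet the TEM has always composited and reported scores directly from raw scores, as is common in classroom tests. As a result, both the comparability of scores across administrations and the interpretability of scores within a single administration are limited. To improve the validity of score interpretation, make the results more usable, and draw more value from them, this dissertation offers some exploratory recommendations for the TEM on the derivation, reporting and interpretation of language test scores. The guiding idea is that the TEM is a public test that belongs to the people; it should therefore be open to public scrutiny and, in turn, serve the public more and better.

The recommendations fall into two levels, general and technical. The former address the TEM's decision-making departments or bodies; the latter address its technical staff. There are six general recommendations, concerning (1) further clarifying the purposes and intended uses of the test, (2) specifying its dimensionality and testing methods, (3) setting an overall score-reporting policy, (4) issuing certificates, (5) training score interpreters, and (6) building a dedicated website. The technical recommendations fall into three groups, concerning (1) the choice of score scale, (2) the reporting of results, and (3) the interpretation of scores.

On the choice of scale, the dissertation recommends that the TEM establish an independent score scale that marks its own identity, that TEM4 and TEM8 share a single scale, and that the scale could range from 0 to 1000. To ease score interpretation while meeting the needs of different users as far as possible, it recommends a system of primary and auxiliary scales. To improve interpretability, it recommends reporting, in addition to the existing raw scores, standardized item-based scores, percentile ranks, grade equivalents and TEM scale scores. To make subscores easier to compare, it recommends a linear score scale as the primary scale. On score reporting, it offers a design blueprint for the TEM score report booklet (or sheet), one each for TEM4 and TEM8, and recommends reporting not only the scores but also their reliability, uncertainty and norms, so that the scores are more usable. On score interpretation, it recommends publishing the TEM scoring standards and descriptors. To ease interpretation, provide more information, and prevent misuse and abuse of the results, it also recommends compiling a users' guide to TEM results.

[Keywords]: educational measurement; score scale; scaling; score linearization; language testing; Test for English Majors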
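Three of the score types named in the abstract (standard scores, percentile ranks, and a scale score on a 0 to 1000 range) can be illustrated with a minimal sketch. It makes several assumptions that do not come from the dissertation: the scale score is a plain linear transformation of the z-score with a hypothetical mean of 500 and standard deviation of 100, percentile rank uses the common mid-rank definition, and all function and parameter names are invented for illustration. The dissertation's own standardized item-based score and scoring models are not reproduced here.

```python
from statistics import mean, pstdev


def standard_scores(raw_scores):
    """Return z-scores: (raw score - group mean) / group standard deviation."""
    centre = mean(raw_scores)
    spread = pstdev(raw_scores)  # population SD, treating the list as the norm group
    if spread == 0:
        raise ValueError("all raw scores are equal; z-scores are undefined")
    return [(x - centre) / spread for x in raw_scores]


def percentile_ranks(raw_scores):
    """Return mid-rank percentile ranks: % scoring below plus half of the % tied."""
    n = len(raw_scores)
    ranks = []
    for x in raw_scores:
        below = sum(1 for y in raw_scores if y < x)
        tied = sum(1 for y in raw_scores if y == x)
        ranks.append(100.0 * (below + 0.5 * tied) / n)
    return ranks


def scale_scores(raw_scores, scale_mean=500.0, scale_sd=100.0, low=0.0, high=1000.0):
    """Map raw scores linearly onto a reporting scale, clipped to [low, high].

    scale_mean and scale_sd are hypothetical constants chosen for illustration.
    """
    return [
        min(high, max(low, scale_mean + scale_sd * z))
        for z in standard_scores(raw_scores)
    ]


if __name__ == "__main__":
    raw = [50, 60, 70, 80, 90]
    rows = zip(raw, standard_scores(raw), percentile_ranks(raw), scale_scores(raw))
    for r, z, p, s in rows:
        print(f"raw={r:3d}  z={z:+.2f}  percentile rank={p:5.1f}  scale score={s:6.1f}")
```

Because the transformation is linear, differences between candidates keep the same proportions on the reporting scale as on the raw scale. Scores from different sections could be placed on the same footing in this way, provided each section is standardized against the same norm group.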
[Thesis outline]:
Acknowledgements
Abstract (Chinese)
Abstract
Contents
List of Tables
List of Figures
List of Acronyms and Abbreviations
Chapter 1 Introduction
1.1 Noticing the Significance
1.1.1 Test Scores and Language Related Research
1.1.2 Test Scores and Decision Making in Educational Programs
1.1.2.1 Selection and Test Scores for Selection Decisions
1.1.2.2 Placement and Test Scores for Placement Decisions
1.1.2.3 Diagnosis and Test Scores for Diagnostic Decisions
1.1.2.4 Test Scores and Program Evaluation
1.1.2.5 Minimum Competence and Test Scores in Minimum Competence Decisions
1.1.3 Test Scores and the Reliability and Uncertainty of Test Results
1.2 Identifying the Object for Research
1.2.1 Identifying Some Theoretical Problems
1.2.1.1 The Important and the Neglected
1.2.1.2 The Conflicts between Theories
1.2.2 Identifying Some Practical Needs
1.2.2.1 The Bleak Picture of Testing Practice in China
1.2.2.2 The Hopeful Future in China’s Testing Practice
1.3 Overview of the Dissertation
1.3.1 Purpose and Scope of the Study
1.3.2 Study Questions
1.3.3 Overview of the Dissertation
1.4 Summary
Chapter 2 Types of Language Tests
2.1 Language Tests: Norm-Referenced and Criterion-Referenced
2.1.1 Norm-Referenced Tests
2.1.1.1 The Origin and Types of Norm-Referencing
2.1.1.2 The Distinctive Features of a Norm-Referenced Test
2.1.1.3 Purposes and Scores for Norm-Referencing
2.1.2 Criterion-Referenced Tests
2.1.2.1 The Origin and Types of Criterion-Referencing
2.1.2.2 The Distinctive Features of a Criterion-Referenced Test
2.1.2.3 Purposes of and Scores for Criterion-Referencing
2.2 Language Tests: Power and Speed
2.2.1 Power Tests
2.2.1.1 Definition and Design Features
2.2.1.2 Purpose and Score for a Power Test
2.2.2 Speed Tests
2.2.2.1 Definition and Design Features
2.2.2.2 Purpose of and Score for a Speed Test
2.3 Language Tests: Mental Power and Mental Work
2.3.1 Tests of Mental Power
2.3.1.1 Definition and Design Features
2.3.1.2 Purpose of and Score for a Test of Mental Power
2.3.2 Tests of Mental Work
2.3.2.1 Definition and Design Features
2.3.2.2 Purpose of and Score for a Test of Mental Work
2.4 Language Tests: Extensive and Intensive
2.4.1 Extensive Tests
2.4.1.1 Definition and Design Features
2.4.1.2 Purpose of and Score for a Test of Extensive Quantity
2.4.2 Intensive Tests
2.4.2.1 Definition and Design Features
2.4.2.2 Purpose of and Score for a Test of Intensive Quantity
2.5 Language Tests: Weakness Based and Strength Based
2.5.1 Definition and Design Features of the Weakness Based Tests
2.5.2 Purposes of and Score for a Weakness Based Test
2.6 Language Tests: Nominal, Ordinal, Interval, and Ratio
2.6.1 Tests at the Nominal Level of Measurement
2.6.1.1 Definition
2.6.1.2 Property of the Scale, Statistics Allowed and Common Mistakes or Misbeliefs
2.6.2 Tests at the Ordinal Level of Measurement
2.6.2.1 Definition
2.6.2.2 Property of the Scale, Statistics Allowed and Common Mistakes or Misbeliefs
2.6.3 Tests at the Interval Level of Measurement
2.6.3.1 Definition
2.6.3.2 Property of the Scale, Statistics Allowed and Common Mistakes or Misbeliefs
2.6.4 Tests at the Ratio Level of Measurement
2.6.4.1 Definition
2.6.4.2 Property of the Scale, Statistics Allowed and Common Mistakes or Misbeliefs
2.7 Summary
Chapter 3 The Derivation of Scores for Language Tests
3.1 Scale, Scaling, Score and Scoring
3.1.1 Scale
3.1.2 Scaling
3.1.3 Score
3.1.4 Scoring
3.2 Some Frequently Used Score Scales: A Critical Review
3.2.1 The Raw Score Scale
3.2.1.1 Definition and Illustration
3.2.1.2 Application(s)
3.2.1.3 Evaluating the Scale
3.2.2 The Percentile Rank Score Scale
3.2.2.1 Definition and Illustration
3.2.2.2 Application(s)
3.2.2.3 Evaluating the Scale
3.2.3 The Standard Score Scale
3.2.3.1 Definition and Illustration
3.2.3.2 Application(s)
3.2.3.3 Evaluating the Scale
3.2.4 The Grade Equivalent Score Scale
3.2.4.1 Definition and Illustration
3.2.4.2 Application(s)
3.2.4.3 Evaluating the Scale
3.2.5 The Latent Trait Score Scale
3.2.5.1 Definition and Illustration
3.2.5.2 Application(s)
3.2.5.3 Evaluating the Scale
3.3 The Standardized Item-Based Score Scale
3.3.1 Definition and Illustration
3.3.2 Application(s)
3.3.3 Evaluating the Models
3.4 Three Models for Scoring
3.4.1 Limitations of Conventional Scoring Models
3.4.2 Fundamental Considerations of Scoring Models
3.4.3 Three Scoring Models
3.4.3.1 The Power Scoring Models
3.4.3.2 The Logistic Scoring Model
3.4.3.3 Standard Uncertainty of the Generated Scores
3.4.3.4 Some General Suggestions
3.5 Summary
Chapter 4 The Reporting of Language Test Scores
4.1 Some General Considerations of Score Reporting
4.1.1 The Purposes of Testing
4.1.1.1 The Primary Purposes of Testing
4.1.1.2 The Secondary Purposes of Testing
4.1.2 The Anticipated Users of Test Results
4.1.2.1 The Non-qualified Users
4.1.2.2 The Less-qualified Users
4.1.2.3 The Well-qualified Users
4.1.3 Information on the Score Report and Information Reserved for the Supporting Documents
4.1.3.1 Information on the Score Report
4.1.3.2 What to Be Provided in the Supporting Documents
4.2 Some Technical Considerations of Score Reporting
4.2.1 True Score, Its Estimate and the Uncertainty of the Estimate
4.2.1.1 The True Score
4.2.1.2 The Estimates of True Scores
4.2.1.3 The Uncertainty of an Estimate: Its Evaluation and Expression
4.2.1.4 The Correction for Guessing
4.2.2 The Reliability of Test Scores
4.2.2.1 The Stability of Scores
4.2.2.2 The Parallel Form Reliability
4.2.2.3 The Generalizability of Observed Scores over the Item Universe
4.2.2.4 The Generalizability of Observed Scores over the Rater Universe
4.2.2.5 The Generalizability of Observed Scores over Both the Item and the Rater Universe
4.3 Summary
Chapter 5 The Interpretation of Language Test Scores
5.1 Validity and Score Interpretation
5.1.1 The Evolving Concept of Validity
5.1.1.1 Validity as Test-Criterion Correlation
5.1.1.2 Validity as Consisting of Different Types
5.1.1.3 Validity as a Unitary Concept
5.1.2 Validity as the Appropriateness of Score Interpretation
5.2 Norms and Norm-Referenced Score Interpretation
5.2.1 Norms and Norming
5.2.1.1 Norms, Norm Groups and the Criteria for Norms
5.2.1.2 Classification of Norms
5.2.2 Interpreting Test Scores by Referencing to the Norms
5.2.2.1 Interpreting Test Scores by Referencing to the Percentile Rank Norms
5.2.2.2 Interpreting Test Scores by Referencing to the Group Average Norm
5.2.2.3 Summary of the Section
5.3 Criterion and Criterion-Referenced Score Interpretation
5.3.1 The Criterion
5.3.1.1 Criterion as Mastery of Domain Knowledge
5.3.1.2 Criterion as Performance on Target Tasks
5.3.1.3 Criterion as Proficiency in Relation to Future Needs
5.3.2 Criterion-Referenced Score Interpretation
5.3.2.1 Interpreting the Criterion Score by Referencing to the Cut Score(s)
5.3.2.2 Interpreting the Criterion Score by Referencing to the Expectancy Table
5.3.2.3 Interpreting the Criterion Score by Referencing to Proficiency Descriptors
5.3.2.4 Interpreting the Criterion Score by Referencing to the Scoring Standards
5.4 Summary
Chapter 6 Analyzing the TEM
6.1 Background Information
6.1.1 General Background Information
6.1.2 A Brief History of TEM
6.1.2.1 A Brief History of TEM4
6.1.2.2 A Brief History of TEM8
6.1.3 The Growing Population of TEM
6.1.3.1 The Growing Population of TEM4
6.1.3.2 The Growing Population of TEM8
6.1.4 The Changing Formats of TEM
6.1.4.1 The Changing Formats of TEM4
6.1.4.2 The Changing Formats of TEM8
6.2 Analyzing the Structure of the TEM Test
6.2.1 The Semantic Structure of the E-TEM4
6.2.1.1 The Surface Structure
6.2.1.2 The Deep Structure
6.2.2 The Structure of the New Generation TEM4
6.2.3 The Structure of TEM8
6.2.3.1 The Surface Structure of TEM8
6.2.3.2 The Deep Structure of TEM8
6.3 The TEM Scoring Practice and the TEM Certificates
6.3.1 Marking the TEM Tests
6.3.1.1 Machine-marking the Multiple Choice Questions
6.3.1.2 Hand-marking the Constructed Response Questions
6.3.2 Reporting the TEM Result
6.3.2.1 Reporting the TEM Score at the Individual Level
6.3.2.2 Reporting the TEM Score at the Institutional Level
6.3.3 Granting the TEM Certificates
6.4 Summary
Chapter 7 Some Recommendations for the TEM
7.1 General Recommendations for TEM
7.1.1 Purposes and Intended Uses of TEM
7.1.2 Dimensionality and Testing Methods of TEM
7.1.3 Raw Score or Scale Score? Skill Scores or Total Score?
7.1.4 The TEM Certificates
7.1.5 Training Score Interpreters
7.1.6 Building an Official Website for the TEM
7.2 Technical Recommendations for TEM
7.2.1 Scoring TEM
7.2.1.1 The Dimensionality of TEM
7.2.1.2 Score Scales
7.2.2 Reporting TEM Result
7.2.2.1 The TEM Score Report
7.2.2.2 The Uncertainty and Normative Information of TEM
7.2.3 Interpreting TEM Scores
7.2.3.1 Descriptors
7.2.3.2 Users’ Guide
7.3 Summary
Chapter 8 Concluding Remarks
8.1 Major Contributions
8.1.1 Theoretical Contributions
8.1.2 Practical Contributions
8.2 Limitations and Suggestions for Further Research
8.3 Summary
Bibliography