6 research outputs found

    Use Text Mining Approach to Generate the Draft of Indictment for Prosecutor

    Motivation: In 2009, the number of criminal cases in Taiwan reached 1.8 million, and each prosecutor had to handle more than 211 cases per month; complaints about overloading are loud and clear. About 70% of criminal cases involve drug abuse, public danger, larceny, and fraud. Although the details of these cases differ, they are relatively simple compared with cases such as homicide or corruption, yet prosecutors still spend costly time handling them. In this paper we use text mining technology to address this issue. Approach: We compare the police investigation document of a criminal case with the court's judgment history, using the cosine similarity algorithm to calculate a similarity coefficient. Based on the highest coefficient, we find the closest judgment for this type of criminal case, which can then be used to decide on and generate the draft indictment for the prosecutor.
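    The matching step described in this abstract can be illustrated with a minimal sketch. It assumes plain whitespace tokenization and raw term counts; the paper's actual preprocessing (for example, Chinese word segmentation or term weighting) is not given here, and the function names `cosine_similarity` and `closest_judgment` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term counts.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def closest_judgment(report, judgments):
    """Return the past judgment with the highest similarity to the report."""
    return max(judgments, key=lambda j: cosine_similarity(report, j))


# Hypothetical toy data, for illustration only.
report = "suspect stole a motorcycle from the parking lot"
judgments = [
    "defendant stole a motorcycle parked near the station",
    "defendant possessed a controlled drug for personal use",
]
best = closest_judgment(report, judgments)
```

    In this toy example the larceny judgment shares more terms with the report than the drug judgment does, so it is the one a draft would be based on.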

    The feasibility of applying Latent Semantic Analysis to analyze Item similarity

    The purpose of this study is to apply latent semantic analysis (LSA) to analyze item similarity and to discuss the results of using different score functions. The distinguishing feature of the LSA model is lexical co-occurrence detection: LSA can analyze many documents and find synonyms. Because synonyms rarely appear within the same item, the LSA model needs to be trained on documents related to that item. This study found that results using the Dice measure or the inner product measure correlate more closely with experts' scores. For the items on which the experts' scores agree more closely than on others, the maximum correlation is up to 0.9 and the mean correlation is up to 0.7, so applying latent semantic analysis to analyze item similarity is a feasible technology.
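    The two score functions named in this abstract can be sketched as follows. This assumes each item has already been projected into an LSA vector space (that step is not shown), and it uses one common real-valued form of the Dice measure; the abstract does not state the exact definitions the study used, and the function names are hypothetical.

```python
def inner_product(u, v):
    """Inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def dice_measure(u, v):
    """Dice measure for real-valued vectors: 2(u.v) / (|u|^2 + |v|^2).

    Assumption: this generalized form; other variants exist.
    """
    denom = inner_product(u, u) + inner_product(v, v)
    if denom == 0:
        return 0.0  # both vectors are zero
    return 2 * inner_product(u, v) / denom


# Hypothetical item vectors in a 2-dimensional LSA space.
item_a = [1.0, 2.0]
item_b = [3.0, 4.0]
score_inner = inner_product(item_a, item_b)
score_dice = dice_measure(item_a, item_b)
```

    Unlike the inner product, the Dice measure is normalized by the vectors' squared lengths, so it stays at or below 1 and equals 1 only for identical vectors.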

    Unknown word extraction for Chinese documents

    No full text