Discovery of Highly Interrelated Communication Entities from Communication Logs
Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): Department of Computer Science, 2011.2, [viii, 69 p.]
Recently, many countries, including the U.S. and EU member states, have legally required their communication service providers to retain electronic communication records, often called communication logs, for a certain period of time. Law enforcement agencies (LEAs) such as the police and the FBI use these retained logs to prevent, investigate, detect, or prosecute serious crimes. The logs rarely include the full communication content, owing to privacy or technical constraints; only minimal information such as senders, receivers, dates and times, and locations is stored. In particular, one-way communication logs often contain a huge number of spam entities, or spammers, which send unsolicited or undesired messages to numerous recipients via electronic messaging systems. This happens because one-way communication services such as e-mail and SMS let spammers send messages indiscriminately to any recipient whose address they know.
In this dissertation, we propose score-based and sequence-based methods for finding highly interrelated communication entities in one-way communication logs, even when the logs include many spam entities.
The score-based method, a Spam-Robust Proximity Scorer, discovers highly interrelated communication entities in a one-way communication log by measuring the proximity scores of normal communication entities with respect to the surveillance target communication entities (or simply surveillance targets), such as criminals and suspects. In other words, for the given surveillance targets, the communication entities that receive high proximity scores are likely to be highly interrelated with those targets. To measure the proximity scores, we derived a new formula considering several metrics such as the number of adjacent communication entities, the number of incident communications, an...
KAIST: Department of Computer Science
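As a rough illustration of two of the metrics named in the abstract, the following Python sketch counts, for each entity in a log, its number of adjacent entities and its number of incident communications. It is not the dissertation's proximity formula, which is truncated in this listing. The function name `entity_metrics`, the `(sender, receiver)` record shape, and the sample log are assumptions made for the example.

```python
from collections import defaultdict


def entity_metrics(log):
    """Return {entity: (adjacent_entities, incident_communications)}.

    `log` is an iterable of (sender, receiver) pairs. This record shape is
    an assumption for illustration, not the format used in the dissertation.
    """
    neighbors = defaultdict(set)   # entity -> distinct entities it exchanged messages with
    incident = defaultdict(int)    # entity -> number of log records involving it
    for sender, receiver in log:
        neighbors[sender].add(receiver)
        neighbors[receiver].add(sender)
        incident[sender] += 1
        incident[receiver] += 1
    return {entity: (len(neighbors[entity]), incident[entity]) for entity in neighbors}


# Hypothetical log: "t" is a surveillance target, "spam" messages everyone once.
sample_log = [
    ("t", "a"), ("a", "t"),
    ("spam", "a"), ("spam", "b"), ("spam", "c"), ("spam", "t"),
]
metrics = entity_metrics(sample_log)
```

In this toy log the spam entity touches many distinct entities exactly once each, while `a` exchanges repeated messages with the target. Differences of this kind are presumably what a spam-robust score has to exploit, though how the dissertation combines the metrics is not stated here.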
THE METHOD FOR PROCESSING DATA AND SYSTEM THEREOF
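The present invention relates to a data processing method and system that use a variable time window to identify interactive communication sequences (ICS) involving multiple users in a communication log, and then identify interactive communication sequence patterns (ICSP), which are the frequently occurring interactive communication sequences among those identified. The data processing method comprises: (a) storing each inverse pair present in the communication log either in an interactive communication sequence set or in a candidate set, the candidate set being a set of inverse pairs that may become part of an interactive communication sequence; (b) combining the interactive communication sequences in the interactive communication sequence set to generate interactive communication sequences whose length is not 1; and (c) combining the inverse pairs in the candidate set with the interactive communication sequences in the interactive communication sequence set of step (a), or with the interactive communication sequences generated in step (b), to generate interactive communication sequences whose length is not 1.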
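The first step of the method can be pictured with a small Python sketch that finds inverse pairs in a log. It rests on several assumptions not confirmed by the abstract: an inverse pair is taken to be a message from u to v followed by a reply from v to u within a time window, the window is fixed rather than variable, and each message takes part in at most one pair. The function name `find_inverse_pairs` and the `(timestamp, sender, receiver)` record shape are hypothetical.

```python
from collections import defaultdict


def find_inverse_pairs(log, window):
    """Pair each message u -> v with the earliest later reply v -> u.

    `log` is a list of (timestamp, sender, receiver) tuples; `window` is the
    maximum allowed gap between a message and its reply. A fixed window is a
    simplification: the patent abstract describes a variable time window.
    """
    pending = defaultdict(list)  # (sender, receiver) -> timestamps awaiting a reply
    pairs = []
    for timestamp, sender, receiver in sorted(log):
        waiting = pending[(receiver, sender)]
        # Discard earlier messages that are too old to be answered in time.
        while waiting and timestamp - waiting[0] > window:
            waiting.pop(0)
        if waiting:
            # This record replies to the oldest unanswered message.
            first = waiting.pop(0)
            pairs.append(((first, receiver, sender), (timestamp, sender, receiver)))
        else:
            # No message to reply to, so this record waits for its own reply.
            pending[(sender, receiver)].append(timestamp)
    return pairs


# Hypothetical log: a and b reply to each other quickly; c replies to a too late.
sample_log = [
    (0, "a", "b"), (5, "b", "a"), (6, "b", "c"),
    (100, "a", "c"), (500, "c", "a"),
]
inverse_pairs = find_inverse_pairs(sample_log, window=60)
```

Here only the exchange between `a` and `b` qualifies, because the reply from `c` arrives 400 time units after the message from `a`. Steps (b) and (c), which combine pairs into longer sequences, are not covered by this sketch.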
