14 research outputs found

    Privacy and Data Protection in Intelligent Surveillance: A Privacy-Preserving System Designed According to the "Privacy by Design" Principle

    Get PDF
    In recent years, surveillance systems have evolved into intelligent installations that generate large amounts of sensitive information. This work elaborates the legal data protection foundations for such systems and presents the technical design of a surveillance system built according to Privacy by Design that intrudes less on the privacy of the persons affected than conventional systems while still offering the technical advantages of intelligent processing.

    A Cost-Effective Method to Prevent Data Exfiltration from LLM Prompt Responses

    Get PDF
    Large language models (LLMs) are susceptible to security risks wherein malicious attackers can manipulate LLMs by poisoning their training data or by using malicious text prompts or queries designed to cause the LLM to return output that includes sensitive or confidential information, e.g., information that is part of the LLM training dataset. This disclosure describes the use of a data loss prevention (DLP) system to protect LLMs against data exfiltration. The DLP system can be configured to detect specific data types that are to be prevented from being leaked. The LLM output, generated in response to a query from an application or user, is passed through the DLP system, which generates a risk score for the LLM output. If the risk score is above a predefined threshold, the LLM output is provided to an additional pre-trained model that has been trained to detect sensitive or confidential data. The output is modified to block, mask, redact, or otherwise remove the sensitive data. The modified output is provided to the application or user. In certain cases, the output may indicate that no response can be provided due to a policy violation.
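
    The following is a minimal, hypothetical sketch of the gating flow described above. The regex-based risk scorer, the sensitive-data detector, and the threshold value are illustrative stand-ins, not the actual DLP system or pre-trained detection model.

```python
import re

RISK_THRESHOLD = 0.7  # assumed policy threshold

# Toy risk scorer standing in for the DLP system: flags outputs that look
# like they contain identifiers or credentials.
PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # SSN-like
    re.compile(r"\b\d{16}\b"),                     # card-number-like
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S+"),   # API-key-like
]

def dlp_risk_score(text: str) -> float:
    hits = sum(bool(p.search(text)) for p in PATTERNS)
    return min(1.0, 0.8 * hits)

# Stand-in for the additional pre-trained detector: returns spans to redact.
def detect_sensitive_spans(text: str) -> list[tuple[int, int]]:
    return [m.span() for p in PATTERNS for m in p.finditer(text)]

def filter_llm_output(llm_output: str) -> str:
    if dlp_risk_score(llm_output) < RISK_THRESHOLD:
        return llm_output                          # low risk: pass through unchanged
    spans = detect_sensitive_spans(llm_output)
    if not spans:
        # High risk but nothing redactable: refuse per policy.
        return "No response can be provided due to a policy violation."
    chars = list(llm_output)
    for start, end in spans:                       # mask detected spans
        chars[start:end] = "*" * (end - start)
    return "".join(chars)

print(filter_llm_output("Your SSN is 123-45-6789."))  # masks the SSN-like value
```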

    AI-based Adaptive Load Balancer for Secure Access to Large Language Models

    Get PDF
    Different large language models (LLMs) specialized for domains such as writing code, engaging in conversations, generating content, etc. are available. A specialized LLM can only reliably answer questions in domains on which it has been trained. The large number of specialized LLM types can make it difficult for a user, such as an application that generates LLM queries, to choose the right type of LLM. This disclosure describes techniques to automatically route query payloads between large language models specialized for different domains. The techniques utilize a vector database to semantically match an LLM to a user query. The techniques also provide a real-time feedback and adaptation mechanism. Security checks and access controls are applied in a centralized manner while adhering to security compliance regimes. The techniques improve the end-to-end security posture of AI-based applications as well as the user experience, and can also reduce the costs of querying large LLMs.
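
    A rough sketch of the routing idea follows. The bag-of-words embedding stands in for a real embedding model and vector database, and the domain registry, routing weights, and feedback rule are illustrative assumptions.

```python
import math
from collections import Counter

# Toy bag-of-words "embedding" standing in for a real embedding model + vector DB.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Registry of specialized models, each indexed by representative domain text.
DOMAIN_INDEX = {
    "code-llm":    embed("write debug refactor python java function code bug"),
    "chat-llm":    embed("conversation chat reply assistant small talk"),
    "content-llm": embed("article blog marketing copy draft generate content"),
}
weights = {name: 1.0 for name in DOMAIN_INDEX}   # adapted via feedback

def route(query: str) -> str:
    q = embed(query)
    scores = {name: weights[name] * cosine(q, vec) for name, vec in DOMAIN_INDEX.items()}
    return max(scores, key=scores.get)

def feedback(model: str, success: bool, lr: float = 0.1) -> None:
    # Real-time adaptation: nudge the routing weight up or down per outcome.
    weights[model] *= (1 + lr) if success else (1 - lr)

chosen = route("please refactor this python function and fix the bug")
print(chosen)                 # expected: code-llm
feedback(chosen, success=True)
```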

    Automatically Detecting Expensive Prompts and Configuring Firewall Rules to Mitigate Denial of Service Attacks on Large Language Models

    Get PDF
    Denial of service attacks on generative artificial intelligence systems, e.g., large language models (LLMs), can include sending LLMs requests that include expensive prompts designed to consume computing resources and degrade model performance. This disclosure describes techniques to automatically detect such prompts and then configure firewall rules that prevent such prompts in subsequent requests from reaching the LLM. Per the techniques, prompts provided to an LLM are evaluated against input and output token size as well as resource utilization to identify prompts that deviate significantly from a baseline. Expensive prompts are identified, and semantically similar prompts are automatically generated using the same LLM or another model. A subset of the generated prompts that are semantically similar to the expensive prompts is identified by comparing respective vector embeddings. The subset of prompts and the received expensive prompts are provided to a pre-trained LLM that generates firewall rules, e.g., web application firewall (WAF) rules. Incoming requests from applications are evaluated based on the rules, and expensive prompts are blocked from reaching the LLM or are rate-limited.
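
    Below is an illustrative sketch of the baseline-deviation check and the resulting blocking behavior. The cost metric, z-score threshold, and keyword-based rule derivation are assumptions; in the disclosure, semantically similar prompts are generated and a pre-trained LLM writes the actual WAF rules.

```python
import re
import statistics

class ExpensivePromptGuard:
    def __init__(self, z_threshold: float = 3.0):
        self.costs: list[float] = []              # observed costs form the baseline
        self.block_rules: list[re.Pattern] = []   # stand-ins for generated WAF rules
        self.z_threshold = z_threshold

    def observe(self, prompt: str, input_tokens: int, output_tokens: int,
                cpu_seconds: float) -> None:
        # Assumed cost metric combining token sizes and resource utilization.
        cost = input_tokens + output_tokens + 100.0 * cpu_seconds
        if len(self.costs) >= 20 and self._is_outlier(cost):
            # Derive a simple keyword rule from the expensive prompt.
            keywords = sorted({w for w in prompt.lower().split() if len(w) > 6})[:3]
            if keywords:
                self.block_rules.append(re.compile("|".join(map(re.escape, keywords))))
        self.costs.append(cost)

    def _is_outlier(self, cost: float) -> bool:
        mean = statistics.mean(self.costs)
        stdev = statistics.pstdev(self.costs) or 1.0
        return (cost - mean) / stdev > self.z_threshold

    def allow(self, prompt: str) -> bool:
        # Incoming requests matching a rule are blocked (or could be rate-limited).
        return not any(rule.search(prompt.lower()) for rule in self.block_rules)
```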

    Tenant Data Security for LLM Applications in Multi-Tenancy Environment

    Get PDF
    Large language models (LLMs) and other types of generative artificial intelligence can be used in a wide variety of business applications. However, there is a possibility of data leakage from LLM responses when an LLM is used in shared multi-tenant environments where each tenant has respective private datasets. Deploying individual adapter layers for each tenant can provide data isolation; however, such implementations can be complex and costly. This disclosure describes techniques to create and maintain a single model that can serve multiple tenants, with security controls for multi-tenancy services to isolate customer data efficiently. Data for different tenants is signed with their respective tenant-specific keys, and the tenant-specific signature is appended prior to training/tuning a model or use by the model at inference time. When a business application of a particular tenant requests a response from the LLM, the response is generated using the adapter layer. The response includes data citations that are verified prior to the response being provided to the business application. The verification is based on the tenant-specific signature in the citation to ensure that only data that belongs to the particular tenant that requested the response is included.
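
    A minimal sketch of the signing and citation-verification step follows, using HMAC as the tenant-specific signature. Key management, the adapter layer, and the citation format are simplified assumptions.

```python
import hmac
import hashlib

TENANT_KEYS = {"tenant-a": b"key-a-secret", "tenant-b": b"key-b-secret"}  # assumed keys

def sign_record(tenant_id: str, record: str) -> tuple[str, str]:
    # Data is signed with the tenant-specific key; the signature is appended.
    sig = hmac.new(TENANT_KEYS[tenant_id], record.encode(), hashlib.sha256).hexdigest()
    return record, sig

def verify_citation(requesting_tenant: str, cited_record: str, signature: str) -> bool:
    expected = hmac.new(TENANT_KEYS[requesting_tenant], cited_record.encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

def release_response(requesting_tenant: str, response_text: str,
                     citations: list[tuple[str, str]]) -> str:
    # Release the response only if every cited record verifies against the
    # requesting tenant's key, i.e., the cited data belongs to that tenant.
    if all(verify_citation(requesting_tenant, rec, sig) for rec, sig in citations):
        return response_text
    return "Response withheld: a citation does not belong to the requesting tenant."

citation = sign_record("tenant-a", "Q3 revenue grew 12%")
print(release_response("tenant-a", "Revenue grew 12% in Q3.", [citation]))  # released
print(release_response("tenant-b", "Revenue grew 12% in Q3.", [citation]))  # withheld
```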

    Virtual Machine Images Preconfigured with Security Scripts for Data Protection and Alerting

    Get PDF
    Developers use interactive development environments (IDEs) to create and share documents that contain live code, equations, visualizations, narrative text, etc. as part of the artificial intelligence/machine learning (AI/ML) development process. Virtual machines (VMs) that run IDEs may have access to private and/or sensitive data used during model training or use. For data security and compliance, it is necessary to highlight and track the VMs that have been in contact with sensitive information. This disclosure describes techniques to automatically identify and label the presence of sensitive data in virtual machines and disks as part of machine learning workflows. Custom VM images are provided that include data scanning scripts that can identify the presence of sensitive data during or after usage, e.g., by a developer using an IDE. The scripts can automatically log the presence of data and generate alerts. Users of such virtual machines are provided additional controls to perform the training process in a secure and confidential manner in compliance with applicable data regulations.
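
    The following is a hypothetical example of the kind of scanning script that could be baked into such a VM image. The scan root, data patterns, and alerting hook are assumptions, not the disclosed implementation.

```python
import logging
import re
from pathlib import Path

# Assumed sensitive-data patterns; a real deployment would use a DLP classifier.
SENSITIVE_PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email":    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

logging.basicConfig(level=logging.INFO)   # in practice, write to a tracked audit log

def scan_path(root: str = "./workspace") -> list[str]:
    findings = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(text):
                findings.append(f"{label} in {path}")
                logging.warning("Sensitive data (%s) detected in %s", label, path)
    return findings

if __name__ == "__main__":
    hits = scan_path()
    if hits:
        # Placeholder alert; the VM/disk could also be labeled for compliance tracking.
        print(f"ALERT: {len(hits)} file(s) contain potentially sensitive data")
```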

    Privacy-aware access control for video data in intelligent surveillance systems

    No full text
    Surveillance systems have become powerful. Objects can be identified, and intelligent surveillance services can generate events when a specific situation occurs. Such surveillance services can be organized in a Service Oriented Architecture (SOA) to fulfill surveillance tasks for specific purposes. The services therefore process information at a high level, e.g., just the position of an object. Video data is still required to visualize a situation to an operator and is needed as evidence in court. Processing of person-related and sensitive information threatens privacy. To protect users and to be compliant with legal requirements, it must be ensured that sensitive information can only be processed for a defined purpose by specific users or services. This work proposes an architecture for access control that enforces the separation of data between different surveillance tasks. Access controls are enforced at different levels: for the users starting the tasks, for the services within the tasks that process data stored in a central store or computed by other services, and for sensor-related services that extract information from the raw data and provide it.
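
    A hedged sketch of the purpose-binding idea: data items are bound to the surveillance task that produced them, and only principals assigned to that task may access them. Task names, principals, and the policy store are illustrative assumptions, not the proposed architecture itself.

```python
from dataclasses import dataclass, field

@dataclass
class SurveillanceTask:
    name: str
    purpose: str
    users: set[str] = field(default_factory=set)      # users who started the task
    services: set[str] = field(default_factory=set)   # services working within the task

class AccessController:
    def __init__(self) -> None:
        self.tasks: dict[str, SurveillanceTask] = {}
        self.data_to_task: dict[str, str] = {}         # data item id -> owning task

    def register(self, task: SurveillanceTask) -> None:
        self.tasks[task.name] = task

    def bind_data(self, data_id: str, task_name: str) -> None:
        self.data_to_task[data_id] = task_name

    def may_access(self, principal: str, data_id: str) -> bool:
        # Data is only visible to principals of the task it was collected for.
        task = self.tasks.get(self.data_to_task.get(data_id, ""))
        return task is not None and (principal in task.users or principal in task.services)

ac = AccessController()
ac.register(SurveillanceTask("perimeter-check", "detect intrusions",
                             users={"operator1"}, services={"tracker-svc"}))
ac.bind_data("video-frame-001", "perimeter-check")
print(ac.may_access("operator1", "video-frame-001"))   # True
print(ac.may_access("operator2", "video-frame-001"))   # False
```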

    Access controls for privacy protection in pervasive environments

    No full text
    Pervasive Environments (PEs) collect and process a massive amount of person-related and sensitive information. Data collected by a single sensor is in most cases not adequate to provide premium services; the gathered information must rather be combined to offer real benefits. The fused data must be secured by access controls to ensure the privacy of users and, with it, their trust in PEs. This work proposes an Object-oriented World Model (OOWM) as a central information source that is filled with information collected from intelligent sensors and can be accessed and manipulated by smart application devices. It is shown how privacy can be enforced in such a centralized component. Privacy requirements must be specified and enforced; in particular, conflicts between different requirements, e.g., user- and operator-specific policies, are an open issue. Existing approaches for the specification and enforcement of access controls are discussed. An XACML-based approach for privacy in PEs is shown, and an algorithm for combining privacy policies is presented.
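
    As one example of how conflicting user- and operator-specific policies can be combined, the sketch below implements a deny-overrides combining rule in the spirit of XACML. The policies and request attributes are made up for illustration and do not reproduce the paper's algorithm.

```python
from enum import Enum

class Decision(Enum):
    PERMIT = "Permit"
    DENY = "Deny"
    NOT_APPLICABLE = "NotApplicable"

def user_policy(request: dict) -> Decision:
    # User-specific policy: deny third parties access to the owner's location.
    if request["resource"] == "location" and request["subject"] != request["owner"]:
        return Decision.DENY
    return Decision.NOT_APPLICABLE

def operator_policy(request: dict) -> Decision:
    # Operator-specific policy: permit services of the active task to read data.
    if request["subject"] in request.get("task_services", set()):
        return Decision.PERMIT
    return Decision.NOT_APPLICABLE

def deny_overrides(policies, request: dict) -> Decision:
    # Deny-overrides combining: any Deny wins, then any Permit, else NotApplicable.
    decisions = [policy(request) for policy in policies]
    if Decision.DENY in decisions:
        return Decision.DENY
    if Decision.PERMIT in decisions:
        return Decision.PERMIT
    return Decision.NOT_APPLICABLE

request = {"subject": "fall-detection-svc", "owner": "alice", "resource": "location",
           "task_services": {"fall-detection-svc"}}
print(deny_overrides([user_policy, operator_policy], request))   # Decision.DENY
```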

    Anonymization in intelligent surveillance systems

    No full text
    Modern surveillance systems collect a massive amount of data. In contrast to conventional systems that store raw sensor material, modern systems take advantage of smart sensors and improvements in image processing. They extract relevant information about the observed objects of interest, which is then stored and processed during the surveillance process. Such high-level information is used, e.g., for situation analysis and can be processed in different surveillance tasks. Modern systems have become powerful and can potentially collect all kinds of user information and make it available to any surveillance task. Hence, direct access to the collected high-level data must be prevented. Multiple approaches for anonymization exist, but they do not consider the special requirements of surveillance tasks. This work examines and evaluates existing metrics and approaches for anonymization. Even though all kinds of data can be collected, position data remains the data in highest demand. Hence, this work focuses on the anonymization of position data and proposes an algorithm that fulfills the requirements for anonymization in surveillance.
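
    One standard way to anonymize position data is spatial cloaking: positions are snapped to grid cells that are coarsened until each reported cell contains at least k objects. The sketch below illustrates that idea; the grid size, k value, and data layout are assumptions and not the algorithm proposed in the work.

```python
from collections import defaultdict

def cloak_positions(positions: dict[str, tuple[float, float]], k: int = 3,
                    cell: float = 10.0, max_cell: float = 1000.0) -> dict:
    """Snap each position to the centre of a grid cell, coarsening the grid
    until every reported cell contains at least k objects (k-anonymity)."""
    while cell <= max_cell:
        cells: dict[tuple[int, int], list[str]] = defaultdict(list)
        for obj_id, (x, y) in positions.items():
            cells[(int(x // cell), int(y // cell))].append(obj_id)
        if all(len(members) >= k for members in cells.values()):
            return {obj_id: ((cx + 0.5) * cell, (cy + 0.5) * cell)
                    for (cx, cy), members in cells.items() for obj_id in members}
        cell *= 2   # too few objects in some cell: coarsen the grid and retry
    return {obj_id: None for obj_id in positions}   # suppress if no safe grid exists

tracks = {"p1": (12.0, 7.5), "p2": (14.2, 9.1), "p3": (11.3, 6.8), "p4": (55.0, 60.0)}
print(cloak_positions(tracks, k=2))   # all four end up sharing coarse cell centres
```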