63 research outputs found

    Online Packing to Minimize Area or Perimeter

    We consider online packing problems where we get a stream of axis-parallel rectangles. The rectangles have to be placed in the plane without overlapping, and each rectangle must be placed without knowing the subsequent rectangles. The goal is to minimize the perimeter or the area of the axis-parallel bounding box of the rectangles. We either allow rotations by 90° or translations only. For the perimeter version we give algorithms with an absolute competitive ratio slightly less than 4 when only translations are allowed and when rotations are also allowed. We then turn our attention to minimizing the area and show that the competitive ratio of any algorithm is at least Ω(√n), where n is the number of rectangles in the stream, and this holds with and without rotations. We then present algorithms that match this bound in both cases and the competitive ratio is thus optimal to within a constant factor. We also show that the competitive ratio cannot be bounded as a function of Opt. We then consider two special cases. The first is when all the given rectangles have aspect ratios bounded by some constant. The particular variant where all the rectangles are squares and we want to minimize the area of the bounding square has been studied before and an algorithm with a competitive ratio of 8 has been given [Fekete and Hoffmann, Algorithmica, 2017]. We improve the analysis of the algorithm and show that the ratio is at most 6, which is tight. The second special case is when all edges have length at least 1. Here, the Ω(√n) lower bound still holds, and we turn our attention to lower bounds depending on Opt. We show that any algorithm for the translational case has a competitive ratio of at least Ω(√Opt). If rotations are allowed, we show a lower bound of Ω(∜Opt). For both versions, we give algorithms that match the respective lower bounds: With translations only, this is just the algorithm from the general case with competitive ratio O(√n) = O(√Opt). If rotations are allowed, we give an algorithm with competitive ratio O(min{√n, ∜Opt}), thus matching both lower bounds simultaneously.
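
To make the two objectives in this abstract concrete, here is a minimal sketch that measures the area and perimeter of the axis-parallel bounding box of a set of placed rectangles. The `(x, y, width, height)` tuple layout and the `pack_in_a_row` strategy are assumptions made purely for illustration; the row strategy is a naive baseline, not one of the paper's algorithms.

```python
from typing import List, Tuple

# Hypothetical representation: (x, y, width, height) with (x, y) the lower-left corner.
Rect = Tuple[float, float, float, float]


def bounding_box(rects: List[Rect]) -> Tuple[float, float]:
    """Return (width, height) of the axis-parallel bounding box of rects."""
    min_x = min(x for x, _, _, _ in rects)
    min_y = min(y for _, y, _, _ in rects)
    max_x = max(x + w for x, _, w, _ in rects)
    max_y = max(y + h for _, y, _, h in rects)
    return max_x - min_x, max_y - min_y


def area_and_perimeter(rects: List[Rect]) -> Tuple[float, float]:
    """Return the two objective values: bounding-box area and perimeter."""
    width, height = bounding_box(rects)
    return width * height, 2 * (width + height)


def pack_in_a_row(sizes: List[Tuple[float, float]]) -> List[Rect]:
    """Naive online placement: each rectangle goes directly to the right of
    the previous one, decided without looking at later rectangles."""
    placed: List[Rect] = []
    x = 0.0
    for w, h in sizes:
        placed.append((x, 0.0, w, h))
        x += w
    return placed


if __name__ == "__main__":
    placed = pack_in_a_row([(2, 1), (1, 3), (3, 2)])
    print(area_and_perimeter(placed))
```

An online algorithm's competitive ratio compares these objective values against the best offline packing (Opt) of the same stream.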

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    Dynamic Management of Distributed Machine Learning Problems

    Machine Learning (ML) and Artificial Intelligence (AI) are two closely related terms. Artificial Intelligence is a discipline that seeks to create machines that have the ability to mimic human cognitive skills, such as learning, reasoning, perception, and decision making. Machine Learning is one of the AI techniques that allows machines to learn from data without being explicitly programmed. The exponential growth of data in recent decades has been one of the main driving factors of AI and Machine Learning advancement. Companies and organizations collect data in increasingly large volumes, including financial transaction information, medical records, IoT sensor data, and more. This data is crucial for driving innovation and progress, but can be too complex and difficult to analyze manually. This is where Machine Learning comes in, allowing machines to learn and automate the analysis of large datasets. This approach reduces the time and effort required to perform complex analyses, as well as providing valuable insights that can be used to improve business operations, increase efficiency, and make more informed decisions. As data continues to grow in size and complexity, new approaches and systems are needed to handle it efficiently. One way this is being done is through the development of more advanced Machine Learning techniques, such as deep neural networks and reinforcement learning algorithms, which can more effectively handle larger and more complex datasets. In addition, the use of technologies such as cloud computing and distributed data processing can also help reduce the consumption of computational resources and make data analysis more scalable. Thus, the proposed solution arises to address some of the challenges that have emerged with the increase in data volume: a distributed machine learning system that runs on a Hadoop cluster and takes advantage of replication, balancing, and block distribution capabilities. It allows models to be trained in a distributed manner following the principle of data locality, being able to change parts of the model through an optimization module, thus enabling the model to evolve over time as new data arrive.

    Collected Papers, Vol. 2

    A graph theoretical perspective for the unsupervised clustering of free text corpora

    This thesis introduces a robust end-to-end topic discovery framework that extracts a set of coherent topics stemming intrinsically from document similarities. Some topic clustering methods can support embedded vectors instead of the traditional Bag-of-Words (BoW) representation. Some do not require the number of topics as a hyperparameter, and others can extract a multi-scale relation between topics. However, no topic clustering method supports all these properties together. This thesis addresses this gap in the literature by designing a framework that supports any type of document-level features, especially embedded vectors. The framework does not require any uninformed decisions about the underlying data, such as the number of topics; instead, it extracts topics in multiple resolutions. To achieve this goal, we combine existing methods from natural language processing (NLP) for feature generation with methods from graph theory, first for graph construction based on semantic document similarities, then for graph partitioning to extract corresponding topics in multiple resolutions. Finally, we use methods from statistical machine learning to obtain highly generalisable supervised topic classifiers that can be deployed for real-time topic extraction. Our applications on both a noisy and specialised corpus of medical records (i.e., descriptions of patient incidents within the NHS) and public news articles in daily language show that our framework extracts coherent topics that have better quantitative benchmark scores than other methods in most cases. The resulting multi-scale topics in both applications enable us to capture specific details more easily and choose the relevant resolutions for the specific objective. This study contributes to the topic clustering literature by introducing a novel graph theoretical perspective that provides a combination of new properties. These properties are multiple resolutions, independence from uninformed decisions about the corpus, and usage of recent NLP features, such as vector embeddings.
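
As a rough illustration of the pipeline this abstract outlines (document vectors, then a similarity graph, then a partition at several resolutions), the sketch below builds a cosine-similarity graph and reads off its connected components at two thresholds. Everything here is an assumption for illustration: the toy vectors, the function names, and above all the use of thresholded connected components as a simple stand-in for the thesis's graph partitioning method, which the abstract does not specify.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two document vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_graph(vectors: List[Sequence[float]], threshold: float) -> Dict[int, List[int]]:
    """Connect documents i and j when their similarity reaches the threshold."""
    adjacency: Dict[int, List[int]] = {i: [] for i in range(len(vectors))}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency


def components(adjacency: Dict[int, List[int]]) -> List[List[int]]:
    """Connected components via depth-first search (stand-in for partitioning)."""
    seen = set()
    groups: List[List[int]] = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:
            node = stack.pop()
            group.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    docs = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]  # toy embeddings
    for threshold in (0.9, 0.05):  # fine resolution, then coarse resolution
        print(threshold, components(similarity_graph(docs, threshold)))
```

Lowering the threshold merges groups, which gives a crude notion of topics at coarser resolutions without fixing the number of topics in advance.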

    Programming Languages and Systems

    This open access book constitutes the proceedings of the 29th European Symposium on Programming, ESOP 2020, which was planned to take place in Dublin, Ireland, in April 2020, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The actual ETAPS 2020 meeting was postponed due to the Corona pandemic. The papers deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems.

    Autotuning for Automatic Parallelization on Heterogeneous Systems

    Semantic discovery and reuse of business process patterns

    Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.

    Scene Understanding For Real Time Processing Of Queries Over Big Data Streaming Video

    With heightened security concerns across the globe and the increasing need to monitor, preserve and protect infrastructure and public spaces to ensure proper operation, quality assurance and safety, numerous video cameras have been deployed. Accordingly, they also need to be monitored effectively and efficiently. However, relying on human operators to constantly monitor all the video streams is not scalable or cost effective. Humans can become subjective, fatigued, even exhibit bias, and it is difficult to maintain high levels of vigilance when capturing, searching and recognizing events that occur infrequently or in isolation. These limitations are addressed in the Live Video Database Management System (LVDBMS), a framework for managing and processing live motion imagery data. It enables rapid development of video surveillance software much like traditional database applications are developed today. Video stream processing applications and ad hoc queries developed in this way are able to reuse advanced image processing techniques that have already been developed. This results in lower software development and maintenance costs. Furthermore, the LVDBMS can be intensively tested to ensure consistent quality across all associated video database applications. Its intrinsic privacy framework facilitates a formalized approach to the specification and enforcement of verifiable privacy policies. This is an important step towards enabling a general privacy certification for video surveillance systems by leveraging a standardized privacy specification language. With the potential to impact many important fields ranging from security and assembly line monitoring to wildlife studies and the environment, the broader impact of this work is clear. The privacy framework protects the general public from abusive use of surveillance technology; success in addressing the trust issue will enable many new surveillance-related applications. Although this research focuses on video surveillance, the proposed framework has the potential to support many video-based analytical applications.

    Scalable and Reliable Middlebox Deployment

    Middleboxes are pervasive in modern computer networks, providing functionalities beyond mere packet forwarding. Load balancers, intrusion detection systems, and network address translators are typical examples of middleboxes. Despite their benefits, middleboxes come with several challenges with respect to their scalability and reliability. The goal of this thesis is to devise middlebox deployment solutions that are cost effective, scalable, and fault tolerant. The thesis includes three main contributions: first, distributed service function chaining with multiple instances of a middlebox deployed on different physical servers to optimize resource usage; second, Constellation, a geo-distributed middlebox framework enabling a middlebox application to operate with high performance across wide area networks; third, a fault tolerant service function chaining system.