Splitwise: Efficient generative LLM inference using phase splitting
Recent innovations in generative large language models (LLMs) have made their
applications and use-cases ubiquitous. This has led to large-scale deployments
of these models, using complex, expensive, and power-hungry AI accelerators,
most commonly GPUs. These developments make LLM inference efficiency an
important challenge. Based on our extensive characterization, we find that
there are two main phases during an LLM inference request: a compute-intensive
prompt computation, and a memory-intensive token generation, each with distinct
latency, throughput, memory, and power characteristics. Despite
state-of-the-art batching and scheduling, the token generation phase
underutilizes compute resources. Specifically, unlike compute-intensive prompt
computation phases, token generation phases do not require the compute
capability of the latest GPUs, and can be run with lower power and cost.
With Splitwise, we propose splitting the two phases of an LLM inference
request onto separate machines. This allows us to use hardware that is
well-suited for each phase, and provision resources independently per phase.
However, splitting an inference request across machines requires state transfer
from the machine running prompt computation over to the machine generating
tokens. We implement and optimize this state transfer using the fast back-plane
interconnects available in today's GPU clusters.
We use the Splitwise technique to design LLM inference clusters using the
same or different types of machines for the prompt computation and token
generation phases. Our clusters are optimized for three key objectives:
throughput, cost, and power. In particular, we show that we can achieve 1.4x
higher throughput at 20% lower cost than current designs. Alternatively, we can
achieve 2.35x more throughput with the same cost and power budgets.
Comment: 12 pages, 19 figures
POLCA: Power Oversubscription in LLM Cloud Providers
Recent innovation in large language models (LLMs) and their myriad use-cases
have rapidly driven up the compute capacity demand for datacenter GPUs. Several
cloud providers and other enterprises have made substantial plans of growth in
their datacenters to support these new workloads. One of the key bottleneck
resources in datacenters is power, and given the increasing model sizes of
LLMs, they are becoming increasingly power intensive. In this paper, we show
that there is a significant opportunity to oversubscribe power in LLM clusters.
Power oversubscription improves the power efficiency of these datacenters,
allowing more deployable servers per datacenter, and reduces the deployment
time, since building new datacenters is slow.
We extensively characterize the power consumption patterns of a variety of
LLMs and their configurations. We identify the differences between the
inference and training power consumption patterns. Based on our analysis of
these LLMs, we claim that the average and peak power utilization in LLM
clusters for inference should not be very high. Our deductions align with the
data from production LLM clusters, revealing that inference workloads offer
substantial headroom for power oversubscription. However, the stringent set of
telemetry and controls that GPUs offer in a virtualized environment makes it
challenging to have a reliable and robust power oversubscription mechanism.
We propose POLCA, our framework for power oversubscription that is robust,
reliable, and readily deployable for GPU clusters. Using open-source models to
replicate the power patterns observed in production, we simulate POLCA and
demonstrate that we can deploy 30% more servers in the same GPU cluster for
inference, with minimal performance loss.
A comparative approach of tumor associated inflammation in mammary cancer between humans and dogs
Infiltrating cells of the immune system are widely accepted to be generic constituents of the tumor microenvironment. It has been well established that the development of mammary cancer, both in humans and dogs, is associated with alterations in the numbers and functions of immune cells at the sites of tumor progression. These tumor-infiltrating immune cells seem to exhibit exclusive phenotypic and functional characteristics, and mammary cancer cells can take advantage of signaling molecules released by them. Cancer-related inflammation has an important role in mammary carcinogenesis, contributing to the acquisition of core hallmark capabilities that allow cancer cells to survive, proliferate, and disseminate. Indeed, recent studies in human breast cancer and in canine mammary tumors have identified a growing list of signaling molecules released by inflammatory cells that serve as effectors of their tumor-promoting actions. These include COX-2, the tumor growth factor EGF, the angiogenic growth factor VEGF, other proangiogenic factors, and a large variety of chemokines and cytokines that amplify the inflammatory state.
This review describes the intertwined signaling pathways shared by T-lymphocytic/macrophage infiltrates and important tissue biomarkers in both human and dog mammary carcinogenesis.
The work was supported partially by the Strategic Research project Pest-OE/AGR/UI0772/2011 and the Research Project UID/AGR/04033/2013, by a Ph.D. scholarship SFRH/BD/78771/2011 financed by the Portuguese Foundation for Science and Technology (FCT), and in part by the Austrian Science Fund (FWF), SFB F4606-B28, to Erika Jensen Jarolim
PAPEL DE UM TUTOR EM UM CURSO A DISTÂNCIA PARA ADOLESCENTES (The role of a tutor in a distance course for adolescents)
When it comes to distance education, many theories deal with andragogy, that is, with adults taking a distance course and their characteristics. This article offers a different perspective: a distance course aimed at adolescents. Accordingly, its objective is to analyze the role of the tutor in a distance course for adolescents. Methodologically, the study is characterized as theoretical-empirical, applied, a case study, participant, and qualitative. Among the results analyzed, some differences in the tutor's role can be seen. For example, technological issues are not perceived as a complicating factor, since adolescent students are more at ease with technology. The virtual environment used should offer tools similar to the social networks this audience uses. On the other hand, loss of motivation during the course is an imminent tendency, since many adolescents cannot see what benefits the course may have for their future, which reflects a more short-term outlook. Thus, the tutor must work much more on motivation than on mediating the technology
Does the number of implants have any relation with peri-implant disease?
Objective: The aim of this study was to evaluate the relationship between the number of pillar implants of implant-supported fixed prostheses and the prevalence of peri-implant disease. Material and Methods: Clinical and radiographic data were obtained for the evaluation. The sample consisted of 32 patients with implant-supported fixed prostheses in function for at least one year. A total of 161 implants were evaluated. Two groups were formed according to the number of implants: G1) ≤5 implants and G2) >5 implants. Data collection included modified plaque index (MPi), bleeding on probing (BOP), probing depth (PD), width of keratinized mucosa (KM) and radiographic bone loss (BL). Clinical and radiographic data were grouped for each implant in order to conduct the diagnosis of mucositis or peri-implantitis. Results: Clinical parameters were compared between groups using Student’s t test for numeric variables (KM, PD and BL) and Mann-Whitney test for categorical variables (MPi and BOP). KM and BL showed statistically significant differences between both groups (
As contribuições da nova Sudene para o desenvolvimento do Nordeste (The contributions of the new Sudene to the development of the Northeast)
This article makes a preliminary assessment of the contributions of the new Sudene to the development of the Northeast, based on the policy guidelines established since its re-creation in 2007. For the study, bibliographic and documentary research was carried out to map Sudene's guidelines and priorities. Secondary statistical data were also examined to analyze the socioeconomic evolution of the Northeast from 2007 to 2017, using IBGE data on GDP, GDP per capita, and the Gini index, and Firjan's municipal human development index. The results indicate that Sudene has regained its strategic importance in federal government policy, as shown by the legal provision aligning the Regional Development Plan for the Northeast (Plano Regional de Desenvolvimento do Nordeste) with the Multi-Year Plan (Plano Plurianual), with participation in drafting the federal budget, in order to secure resources for the region's development. A decade after its re-creation, socioeconomic advances can be seen, such as the highest GDP per capita growth in the country: 41.5% between 2007 and 2015, against a national average of 29%. The human development index improved significantly in the health and education dimensions, with cumulative changes over the period of 55.7% and 70.5%, respectively. However, the limitations of this progress remain notable, since GDP per capita is still roughly half the national average and the Northeast still has the highest income inequality index in the country