180 research outputs found
Synthesis of well-defined catechol polymers for surface functionalization of magnetic nanoparticles
In order to obtain dual-modal fluorescent magnetic nanoparticles, well-defined fluorescent functional polymers with terminal catechol groups were synthesized by single electron transfer living radical polymerization (SET-LRP) under aqueous conditions for “grafting to” modification of iron oxide nanoparticles. Acrylamide, N-isopropylacrylamide, poly(ethylene glycol) methyl ether acrylate, 2-hydroxyethyl acrylate, glycomonomer and rhodamine B piperazine acrylamide were homo-polymerized or block-copolymerized directly from an unprotected dopamine-functionalized initiator in an ice-water bath. The Cu-LRP tolerated the presence of catechol groups, leading to polymers with narrow molecular weight distributions (Mw/Mn < 1.2) and high or full conversion obtained in a few minutes. Subsequent immobilization of dopamine-terminal copolymers on an iron oxide surface was successful, as demonstrated by Fourier transform infrared spectroscopy (FTIR), dynamic light scattering (DLS), transmission electron microscopy (TEM) and thermogravimetric analysis (TGA), generating stable polymer-coated fluorescent magnetic nanoparticles. The nanoparticles coated with hydrophilic polymers showed no significant cytotoxicity when compared with unmodified particles, and the cellular uptake of fluorescent nanoparticles by A549 cells was very efficient, which also indicated the potential application of these advanced nanomaterials for bio-imaging.
Measuring Firm Size in Empirical Corporate Finance
In empirical corporate finance, firm size is commonly used as an important, fundamental firm characteristic. However, no research comprehensively assesses the sensitivity of empirical results in corporate finance to different measures of firm size. This paper fills this gap by providing empirical evidence for a “measurement effect” in the “size effect”. In particular, we examine the influence of employing different proxies of firm size (total assets, total sales, and market capitalization) in 20 prominent areas of empirical corporate finance research. We highlight several empirical implications. First, in most areas of corporate finance the coefficients of firm size measures are robust in sign and statistical significance. Second, the coefficients on regressors other than firm size often change sign and significance when different size measures are used. Unfortunately, this suggests that some previous studies are not robust to different firm size proxies. Third, the goodness of fit measured by R-squared also varies with different size measures, suggesting that some measures are more relevant than others in different situations. Fourth, different proxies capture different aspects of “firm size”, and thus have different implications in corporate finance. Therefore, the choice of size measures needs both theoretical and empirical justification. Finally, our empirical assessment provides guidance to empirical corporate finance researchers who must use firm size measures in their work.
Datasets for Large Language Models: A Comprehensive Survey
This paper embarks on an exploration into the Large Language Model (LLM)
datasets, which play a crucial role in the remarkable advancements of LLMs. The
datasets serve as the foundational infrastructure analogous to a root system
that sustains and nurtures the development of LLMs. Consequently, examination
of these datasets emerges as a critical topic in research. In order to address
the current lack of a comprehensive overview and thorough analysis of LLM
datasets, and to gain insights into their current status and future trends,
this survey consolidates and categorizes the fundamental aspects of LLM
datasets from five perspectives: (1) Pre-training Corpora; (2) Instruction
Fine-tuning Datasets; (3) Preference Datasets; (4) Evaluation Datasets; (5)
Traditional Natural Language Processing (NLP) Datasets. The survey sheds light
on the prevailing challenges and points out potential avenues for future
investigation. Additionally, a comprehensive review of the existing available
dataset resources is also provided, including statistics from 444 datasets,
covering 8 language categories and spanning 32 domains. Information from 20
dimensions is incorporated into the dataset statistics. The total data size
surveyed surpasses 774.5 TB for pre-training corpora and 700M instances for
other datasets. We aim to present the entire landscape of LLM text datasets,
serving as a comprehensive reference for researchers in this field and
contributing to future studies. Related resources are available at:
https://github.com/lmmlzn/Awesome-LLMs-Datasets.
Comment: 181 pages, 21 figures
UPOCR: Towards Unified Pixel-Level OCR Interface
In recent years, the optical character recognition (OCR) field has been
proliferating with plentiful cutting-edge approaches for a wide spectrum of
tasks. However, these approaches are task-specifically designed with divergent
paradigms, architectures, and training strategies, which significantly
increases the complexity of research and maintenance and hinders the fast
deployment in applications. To this end, we propose UPOCR, a
simple-yet-effective generalist model for Unified Pixel-level OCR interface.
Specifically, the UPOCR unifies the paradigm of diverse OCR tasks as
image-to-image transformation and the architecture as a vision Transformer
(ViT)-based encoder-decoder. Learnable task prompts are introduced to push the
general feature representations extracted by the encoder toward task-specific
spaces, endowing the decoder with task awareness. Moreover, the model training
is uniformly aimed at minimizing the discrepancy between the generated and
ground-truth images regardless of the inhomogeneity among tasks. Experiments
are conducted on three pixel-level OCR tasks including text removal, text
segmentation, and tampered text detection. Without bells and whistles, the
experimental results showcase that the proposed method can simultaneously
achieve state-of-the-art performance on three tasks with a unified single
model, which provides valuable strategies and insights for future research on
generalist OCR models. Code will be publicly available.
Research on identity authentication methods for IoT devices in smart tourism
The internet of things (IoT) is a key trend in smart tourism, involving multiple stakeholders such as government management, public cloud platforms, device manufacturers, scenic areas, and tourists. IoT devices, often deployed in public spaces, are vulnerable to physical attacks, making identity authentication critical for security. A certificate-free identity authentication method based on administrative applications was proposed, using MQTT protocol message queues to maintain device security status and addressing issues with low-power devices in sleep mode. The use of national cryptographic algorithms ensured secure and controllable IoT information. Performance evaluations show that the method effectively helps prevent security threats, achieving an average authentication accuracy of 99.7%, with embedded RAM and FLASH usage not exceeding 35 KB and 30 KB, respectively, making it suitable for smart tourism applications.
Research on side-channel attacks and defense methods for IoT devices
Internet of things (IoT) devices are typically implemented using microcontrollers with limited computational capabilities, which necessitates the use of lightweight symmetric encryption algorithms to ensure data security. Due to their inherent characteristics, these devices can only be deployed in open environments, making them highly vulnerable to side-channel attacks. To address this issue, experiments were conducted on a self-designed side-channel attack validation platform, and a secure key management scheme and an improved S-box design were proposed as countermeasures against side-channel attacks. The validation platform consisted of a two-stage differential amplifier and an anti-interference finite impulse response (FIR) filter, which were capable of capturing subtle power consumption fluctuations. A two-round correlated energy attack targeting lightweight encryption algorithms was also designed. By evaluating the confidence of the correct key correlation coefficient, after 10 000 attacks on 3 000 power consumption traces of the PRESENT algorithm, a success rate of over 96% was achieved, with the mean correlation of the correct key exceeding 0.6. At a 95% confidence level, a narrow confidence interval was obtained. In contrast, when the improved algorithm was used in the same experiment, the attack success rate was only 9.12%.
A brief exploration of the physical properties of single living cells under dynamic loading conditions
Introduction: Single living cells exhibit both active biological functions and material-like mechanical behaviors. While extensive research has focused on static or quasi-static loading, the purely mechanical properties under high-rate impact remain underexplored. Investigating cell responses to dynamic loading can isolate rapid deformation characteristics, potentially clarifying how life activities modulate mechanical behavior. Methods: We developed a custom dynamic loading system to expose single adherent macrophage cells to transient compression–shear stresses in a controlled fluid environment. A polymethyl methacrylate chamber housed the cells, and impact pressures (156.48–3603.85 kPa) were measured in real time using a high-frequency sensor. High-speed imaging (up to 2×10^5 fps) captured cellular area changes, providing insight into global deformation. In total, 198 valid experiments were performed, and statistical tests confirmed that initial perimeter and area followed normal-like distributions suitable for theoretical analysis. Results: Cells demonstrated a two-stage expansion under shock loading. At lower pressures, cytoplasmic regions rapidly spread into the focal plane, producing significant increases in projected area. As pressure rose further, deformation rate decreased, reflecting the constraining influence of the nucleus. By analyzing the final-to-initial area ratios across various pressures and initial cell sizes, we derived an incomplete state equation akin to Tait-like or Birch–Murnaghan models, indicating an inflection point of maximum deformation rate. Discussion: These findings highlight that fast impact loading effectively minimizes confounding biological processes, revealing intrinsic mechanical responses. The proposed state equation captures cell behavior within milliseconds, offering a path to integrate dynamic results with slower, life-activity-driven adaptations, and laying groundwork for more comprehensive biomechanical models of living cells.
Highly enhanced thermoelectric and mechanical performance of copper sulfides via natural mineral in-situ phase separation
In situ phase separation precipitates play an important role in enhancing the thermoelectric properties of copper sulfides by suppressing phonon transmission. In this study, Cu1.8S composites were fabricated by melting reactions and spark plasma sintering. The complex structures, namely, micron-PbS, Sb2S3, nano-FeS, and multiscale pores, originate from the introduction of FePb4Sb6S14 into the Cu1.8S matrix. Through effective element (Fe) doping and multiscale precipitates, the Cu1.8S + 0.5 wt% FePb4Sb6S14 bulk composite reached a high dimensionless figure of merit (ZT) value of 1.1 at 773 K. Furthermore, the modulus obtained for this sample was approximately 40.27 GPa, which was higher than that of the pristine sample. This study provides a novel strategy for realizing heterovalent doping while forming various precipitates via in situ phase separation by natural minerals, which has been proven to be effective in improving the thermoelectric and mechanical performance of copper sulfides and is worth promoting in other thermoelectric systems.
MedShapeNet – a large-scale dataset of 3D medical shapes for computer vision
Objectives: Shape is commonly used to describe objects. State-of-the-art algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from the growing popularity of ShapeNet (51,300 models) and Princeton ModelNet (127,915 models). However, a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments is missing. Methods: We present MedShapeNet to translate data-driven vision algorithms to medical applications and to adapt state-of-the-art vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. We present use cases in classifying brain tumors, skull reconstructions, multi-class anatomy completion, education, and 3D printing. Results: By now, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. Conclusions: MedShapeNet contains medical shapes from anatomy and surgical instruments and will continue to collect data for benchmarks and applications. The project page is: https://medshapenet.ikim.nrw/
MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision
Prior to the deep learning era, shape was commonly used to describe the
objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are
predominantly diverging from computer vision, where voxel grids, meshes, point
clouds, and implicit surface models are used. This is seen from numerous
shape-related publications in premier vision conferences as well as the growing
popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915
models). For the medical domain, we present a large collection of anatomical
shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments,
called MedShapeNet, created to facilitate the translation of data-driven vision
algorithms to medical applications and to adapt SOTA vision algorithms to
medical problems. As a unique feature, we directly model the majority of shapes
on the imaging data of real patients. As of today, MedShapeNet includes 23
datasets with more than 100,000 shapes that are paired with annotations (ground
truth). Our data is freely accessible via a web interface and a Python
application programming interface (API) and can be used for discriminative,
reconstructive, and variational benchmarks as well as various applications in
virtual, augmented, or mixed reality, and 3D printing. As examples, we present
use cases in the fields of classification of brain tumors, facial and skull
reconstructions, multi-class anatomy completion, education, and 3D printing. In
the future, we will extend the data and improve the interfaces. The project pages
are: https://medshapenet.ikim.nrw/ and
https://github.com/Jianningli/medshapenet-feedback
Comment: 16 pages
