
    The Civil War Letters of Jeremiah Mickly of Franklin Township, Adams County

    On December 2, 1862, just eleven days before the Battle of Fredericksburg, Virginia, Jeremiah Mickly said goodbye to his wife and two children and reported for duty with the 177th Pennsylvania Infantry to become a Civil War chaplain. The only known photograph of Mickly shows him dressed in the standard chaplain's uniform of the day: a plain black frock coat with a standing collar and black buttons with plain black pantaloons. Like many other Civil War soldiers, Mickly re-enlisted for service after his stint with the 177th ended, becoming chaplain of the 43rd Regiment, United States Colored Troops. Impressed with the educational progress and courage of the black soldiers he served with, Mickly wrote a history of the 43rd Regiment. The 88-page booklet was published in 1866 in Gettysburg by J. E. Wible, Printer. Mickly's book and correspondence prove that his Civil War experience shaped his belief that black people are entitled to equal rights. [excerpt]

    PTab: Using the Pre-trained Language Model for Modeling Tabular Data

    Tabular data is the foundation of the information age and has been extensively studied. Recent studies show that neural-based models are effective at learning contextual representations of tabular data. Learning an effective contextual representation requires meaningful features and a large amount of data. However, current methods often fail to learn a proper contextual representation from features that lack semantic information. In addition, it is intractable to enlarge the training set by mixing tabular datasets, due to differences between datasets. To address these problems, we propose PTab, a novel framework that uses a Pre-trained language model to model Tabular data. PTab learns a contextual representation of tabular data through a three-stage procedure: Modality Transformation (MT), Masked-Language Fine-tuning (MF), and Classification Fine-tuning (CF). We initialize our model with a pre-trained model (PTM) that contains semantic information learned from large-scale language data. Consequently, the contextual representation can be learned effectively during the fine-tuning stages. In addition, we can naturally mix the textualized tabular data to enlarge the training set and further improve representation learning. We evaluate PTab on eight popular tabular classification datasets. Experimental results show that our method achieves a better average AUC score in supervised settings than state-of-the-art baselines (e.g., XGBoost), and outperforms counterpart methods in semi-supervised settings. We also present visualization results showing that PTab has good instance-based interpretability.
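    The modality-transformation stage turns each tabular record into a text sequence that a masked language model can consume. The abstract does not specify the serialization template, so the `"column is value."` pattern, the `textualize_row` helper, and the sample columns below are a hypothetical sketch of the idea, not the paper's exact format:

    ```python
    def textualize_row(row: dict) -> str:
        """Serialize one tabular record into a sentence-like string
        (hypothetical template: '<column> is <value>.')."""
        return " ".join(f"{col} is {val}." for col, val in row.items())

    # A textualized row can then be fed to a pre-trained masked LM,
    # and rows from different datasets can be mixed freely as plain text.
    textualize_row({"age": 39, "education": "Bachelors", "hours_per_week": 40})
    # -> "age is 39. education is Bachelors. hours_per_week is 40."
    ```

    Because every dataset reduces to text under such a template, training sets from heterogeneous tables can be pooled, which is what makes the mixed-dataset enlargement described above natural.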

    Computationally Efficient Confidence Intervals for Cross-validated Area Under the ROC Curve Estimates

    In binary classification problems, the area under the ROC curve (AUC) is an effective measure of a model's performance. Cross-validation is often used alongside it to assess how the results will generalize to an independent data set. To evaluate the quality of an estimate of cross-validated AUC, we must obtain an estimate of its variance. For massive data sets, generating even a single performance estimate can be computationally expensive, and when using a complex prediction method, calculating the cross-validated AUC on even a relatively small data set can require substantial computation time. Thus, when obtaining a single estimate of cross-validated AUC is costly, the bootstrap, as a means of variance estimation, can be computationally intractable. As an alternative to the bootstrap, we demonstrate a computationally efficient influence curve based approach to obtaining a variance estimate for cross-validated AUC.
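    The influence-curve idea rests on viewing the AUC as the two-sample U-statistic θ = P(S⁺ > S⁻): a first-order expansion assigns each observation an influence value, and the variance of the estimator is approximated by the empirical variance of those values divided by n, with no resampling. The sketch below illustrates this for a single validation sample (the `auc_ic_variance` name and the tie-ignoring simplification are our own; the paper's estimator aggregates influence values across cross-validation folds):

    ```python
    import numpy as np

    def auc_ic_variance(y, scores):
        """Empirical AUC and an influence-curve variance estimate (one sample).

        Influence values (ties ignored for brevity), with p = P(y = 1):
          IC(s, y=1) = (1/p)     * (P(S- < s) - AUC)
          IC(s, y=0) = (1/(1-p)) * (P(S+ > s) - AUC)
        """
        y, s = np.asarray(y), np.asarray(scores, dtype=float)
        pos, neg = s[y == 1], s[y == 0]
        p, n = y.mean(), len(y)
        auc = (pos[:, None] > neg[None, :]).mean()  # empirical AUC
        ic = np.where(
            y == 1,
            (1.0 / p) * ((s[:, None] > neg[None, :]).mean(axis=1) - auc),
            (1.0 / (1.0 - p)) * ((s[:, None] < pos[None, :]).mean(axis=1) - auc),
        )
        return auc, ic.var() / n  # Var(theta_hat) ~ Var(IC) / n
    ```

    A Wald-style 95% interval is then `auc ± 1.96 * sqrt(variance)`. Since the influence values are computed in a single pass over the validation predictions, the cost is negligible next to refitting the model for each bootstrap replicate.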