5 research outputs found
Recommended from our members
Deploying perfSONAR-based End-2-End Monitoring for Production US CMS Networking
Fermilab is the US Tier-1 Center for CMS data storage and analysis. End-2-End (E2E) circuits are used to support high-impact data movement into and out of the Tier-1 Center. E2E circuits have been implemented to facilitate the movement of raw experiment data from the Tier-0 Center at CERN, as well as processed data to a number of the US Tier-2 sites. Troubleshooting and monitoring those circuits presents a significant challenge, since the circuits typically cross multiple research & education networks, each with its own management domain and customized monitoring capabilities. The perfSONAR Monitoring Project was established to facilitate development and deployment of a common monitoring infrastructure across multiple network management domains. Fermilab has deployed perfSONAR across its E2E circuit infrastructure and enhanced the product with several tools that ease the monitoring and management of those circuits. This paper will present the current state of perfSONAR monitoring at Fermilab and detail our experiences using perfSONAR to manage our current E2E circuit infrastructure. We will describe how production network circuits are monitored by perfSONAR E2E Monitoring Points (MPs), and the benefits this monitoring has brought to production US CMS networking support.
Metropolitan Area Network Support at Fermilab
Advances in wide area network service offerings, coupled with comparable developments in local area network technology, have enabled many research sites to keep their offsite network bandwidth ahead of demand. For most sites, the more difficult and costly aspect of increasing wide area network capacity is the local loop, which connects the facility LAN to the wide area service provider(s). Fermilab, in coordination with neighboring Argonne National Laboratory, has chosen to provide its own local loop access by leasing dark fiber to nearby network exchange points and procuring dense wavelength division multiplexing (DWDM) equipment to provide data channels across those fibers. Installing and managing such optical network infrastructure has broadened the Laboratory's network support responsibilities to include operating network equipment that is located off-site and is technically quite different from classic LAN equipment. Effectively, the Laboratory has assumed the role of a local service provider. This paper will cover Fermilab's experiences with deploying and supporting a Metropolitan Area Network (MAN) infrastructure to satisfy its offsite networking needs. The benefits and drawbacks of providing and supporting such a service will be discussed.
End-to-End Network/Application Performance Troubleshooting Methodology
The computing models for HEP experiments are globally distributed and grid-based. Obstacles to good network performance arise from many causes and can be a major impediment to the success of these computing models. Factors that affect overall network/application performance exist on the hosts themselves (application software, operating system, hardware), in the local area networks that support the end systems, and within the wide area networks. Since the computer and network systems are globally distributed, it can be very difficult to locate and identify the factors that are hurting application performance. In this paper, we present an end-to-end network/application performance troubleshooting methodology developed and in use at Fermilab. The core of our approach is to narrow down the problem scope with a divide-and-conquer strategy. The overall complex problem is split into two distinct sub-problems: host diagnosis and tuning, and network path analysis. After satisfactorily evaluating, and if necessary resolving, each sub-problem, we conduct end-to-end performance analysis and diagnosis. The paper will discuss the tools we use as part of the methodology. The long-term objective of the effort is to enable site administrators and end users to conduct much of the troubleshooting themselves, before (or instead of) calling upon network and operating system 'wizards,' who are always in short supply.
Lambda Station: Alternate Network Path Forwarding for Production SciDAC Applications
The LHC era will start very soon, creating data volumes large enough to demand allocation of an entire network circuit for task-driven applications. Circuit-based alternate network paths are one solution for meeting the LHC's high-bandwidth network requirements. The Lambda Station project is aimed at addressing growing requirements for dynamic allocation of alternate network paths. Lambda Station facilitates the rerouting of designated traffic through site LAN infrastructure onto so-called 'high-impact' wide-area networks. The prototype Lambda Station, developed with a Service Oriented Architecture (SOA) approach in mind, will be presented. Lambda Station has been successfully integrated into the production version of the Storage Resource Manager (SRM) and deployed at the US CMS Tier-1 center at Fermilab, as well as at the US CMS Tier-2 site at Caltech. This paper will discuss experiences using the prototype system with production SciDAC applications for data movement between Fermilab and Caltech. The architecture and design principles of the production version of the Lambda Station software, currently being implemented as Java-based web services, will also be presented in this paper.