ARSC system downtime for pacman (all)

Contents for pacman

News Items

18 May 2013 Scheduled Downtime

Last Updated: Fri, 03 May 2013 -
Machines: pacman
Start Time: 05/18/2013 -- 12:00
  End Time: 05/30/2013 -- 12:00
    Reason: University FAST power outage and OS upgrade to
            RHEL 6.4. Default modules will be updated.
            *** Recompiling of user code required. ***
            To recompile your pacman code in the RHEL 6.4
            environment before the May 18th scheduled downtime,
            log on to pacman14.arsc.edu and recompile using the
            software available via modules and /usr/local/pkg.
            To submit batch jobs in the RHEL 6.4 environment
            before the scheduled downtime, please contact the
            ARSC Help Desk.

            *** Update on May 24, 2013 ***
            The OS upgrade is taking longer than first anticipated.
            Therefore, the pacman.arsc.edu scheduled downtime has 
            been extended through May 30th.
            Files on $CENTER and $ARCHIVE are accessible via logins
            to fish.arsc.edu. 

20 Apr 2013 Scheduled Downtime

Last Updated: Mon, 22 Apr 2013 -
Machines: pacman
Start Time: 04/20/2013 -- 11:00
  End Time: 04/20/2013 -- 12:35
    Reason: Due to a network outage, many 4 core nodes were rebooted. 

11 Apr 2013 Unscheduled Downtime

Last Updated: Fri, 12 Apr 2013 -
Machines: pacman
Start Time: 04/06/2013 -- 00:09
  End Time: 04/11/2013 -- 00:09
    Reason: The recurring issue with the pacman batch
            scheduler has been resolved.  Users who lost jobs as 
            a result of the 12:09am batch scheduler failure on
            April 6th, 7th, and 8th were notified.  
            All previously held long running jobs have been released 
            and all pacman queues are now functioning normally.

06 Apr 2013 Unscheduled Downtime

Last Updated: Sat, 06 Apr 2013 -
Machines: pacman
Start Time: 04/06/2013 -- 00:00
  End Time: 04/06/2013 -- 14:00
    Reason: There was an issue with the administrative node on pacman which 
            resulted in jobs on 12 core, 16 core and bigmem nodes failing.

            Jobs running on 4 core nodes should not have been affected by this
            outage.

27 Mar 2013 Unscheduled Downtime

Last Updated: Wed, 27 Mar 2013 -
Machines: linuxws pacman bigdipper fish
Start Time: 03/27/2013 -- 15:30
  End Time: 03/28/2013 -- 12:55
    Reason: Power was lost to the machine room.  An emergency power down was
            initiated on pacman, fish, and bigdipper.  All running jobs were lost.

            Users with running jobs at the time of the power outage will be
            contacted with a list of lost jobs.

            03/27/2013 -- 20:30 - Linux Workstations were returned to service

            03/27/2013 -- 23:00 - Fish was returned to service
 
            03/28/2013 -- 00:00 - Pacman was returned to service

            03/28/2013 -- 09:30 - Web Servers and the license server are still
            being recovered.

            03/28/2013 -- 11:30 - Web Servers have been returned to service.
            
            03/28/2013 -- 12:55 - The license server was returned to service.