HLRN System - News Archive 2014


ID      Published     Subject
[2728]  Dec 29, 2014  Konrad: Maintenance on Tuesday, Dec 30
[2725]  Dec 26, 2014  Security Alert at HLRN
[2722]  Dec 22, 2014  Display of the home quota limits
[2720]  Dec 19, 2014  Berlin complex Konrad relaunch
[2715]  Dec 18, 2014  General Access to HLRN-III Hannover
[2711]  Dec 08, 2014  HLRN-III Hannover declares Operational Readiness on Dec 8, 2014
[2710]  Dec 04, 2014  Update: Berlin complex Konrad down
[2693]  Oct 02, 2014  Status information on the extension of the Berlin complex Konrad
[2690]  Sep 18, 2014  PERM Archive and Data Management at HLRN/Hannover
[2688]  Sep 09, 2014  Hannover HLRN Upgrade and PERM Maintenance
[2686]  Sep 04, 2014  Resolved: Lustre problems in Hannover
[2684]  Sep 04, 2014  Lustre problems in Hannover - system closed for user access
[2682]  Sep 01, 2014  XC30 (MPP1) in Hannover available again
[2680]  Sep 01, 2014  XC30 (MPP1) in Hannover closed
[2675]  Aug 21, 2014  PERM Hannover remains unavailable
[2672]  Aug 19, 2014  Maintenance of $PERM in Hannover
[2667]  Aug 15, 2014  Extended Downtime in Berlin for phase 2 upgrade of Konrad starts September 3, 2014
[2666]  Aug 15, 2014  Berlin: Maintenance day at ZIB on August 26, 2014
[2663]  Aug 01, 2014  Announcement: EMEA IPCC Meeting 9.-10. Sept. 2014 at ZIB
[2661]  Aug 01, 2014  Scheduled Maintenance Complex Hannover Wed Aug 06., starting 08:30
[2657]  Jul 28, 2014  Resolved - Gottfried: Production on the Cray XC30 in Hannover stopped
[2656]  Jul 27, 2014  Batch system: Change for small jobs in Berlin
[2655]  Jul 27, 2014  Batch system: Requesting hardware by feature specification
[2650]  Jul 24, 2014  Konrad available again; Lustre problems in Berlin solved
[2649]  Jul 21, 2014  Konrad: Recurrent Lustre problems in Berlin - system closed for user access
[2643]  Jul 16, 2014  Resolved: Konrad: Lustre problems in Berlin
[2642]  Jul 14, 2014  Maintenance of the Archive System ($PERM) in Hannover extended
[2637]  Jul 10, 2014  Maintenance of the Archive System ($PERM) in Hannover preponed to 18:00 today, Thursday 10. 7. 2014
[2634]  Jul 10, 2014  Maintenance of the Archive System ($PERM) in Hannover
[2628]  Jul 08, 2014  Apply for Computing Time by July 28th, 2014
[2632]  Jul 07, 2014  Preannouncement: Extended downtimes in Berlin and Hannover in September and October
[2625]  Jun 27, 2014  Berlin: Server maintenance at ZIB
[2619]  Jun 24, 2014  Finished: Maintenance of the Archive (PERM) System in Hannover
[2616]  Jun 17, 2014  Newsletter: Sixteenth edition of HLRN Informationen published
[2613]  Jun 16, 2014  Finished: Maintenance of Archive (PERM) System in Hannover
[2602]  Jun 13, 2014  Finished: Installation works in Hannover
[2600]  Jun 06, 2014  Maintenance for Archive (PERM) System in Hannover
[2597]  Jun 05, 2014  Installation works in Hannover next week June 11. / 12.
[2594]  Jun 05, 2014  Update: PERM in Hannover available again
[2588]  May 21, 2014  Update: PERM in Hannover unavailable
[2585]  May 08, 2014  Maintenance of $PERM in Hannover
[2583]  May 05, 2014  Verwaltungsrat passed Entgeltordnung
[2575]  Apr 28, 2014  Konrad: Network failure over the weekend
[2573]  Apr 24, 2014  Archive servers in Berlin unavailable on April 29, 2014
[2569]  Apr 22, 2014  Finished - Maintenance on archive system in Hannover
[2567]  Apr 10, 2014  Maintenance on SMP system in Hannover finished
[2565]  Apr 07, 2014  Apply for Computing Time by April 28th, 2014
[2562]  Apr 07, 2014  Maintenance on SMP system in Hannover
[2555]  Mar 20, 2014  PERM Maintenance in Hannover Wed Mar 26 09:00 a.m.
[2553]  Mar 19, 2014  Finished: Maintenance in Hannover
[2549]  Mar 13, 2014  Maintenance in Hannover
[2547]  Mar 11, 2014  Solved: access to PERM in Hannover
[2544]  Mar 10, 2014  Access to PERM in Hannover with problems
[2542]  Feb 27, 2014  Detailed job accounting information available
[2537]  Feb 24, 2014  Maintenance of PERM in Hannover finished
[2534]  Feb 21, 2014  Maintenance of $PERM in Hannover
[2531]  Feb 20, 2014  Berlin Maintenance on Tue, Feb 25, 2014
[2527]  Feb 14, 2014  Finished: Maintenance of $PERM in Hannover
[2525]  Feb 13, 2014  Maintenance of $PERM in Hannover
[2520]  Feb 06, 2014  Berlin Maintenance on Tue, Feb. 11, 2014
[2518]  Feb 05, 2014  Finished: Hannover Maintenance on Tue, Feb. 4th
[2508]  Jan 29, 2014  HLRN Hannover available again
[2505]  Jan 15, 2014  HLRN-III Hannover - Lustre Problem solved
[2503]  Jan 15, 2014  HLRN-III - Lustre problem in Hannover
[2502]  Jan 13, 2014  NPL accounting active since January 1st, 2014
[2494]  Jan 10, 2014  Apply for Computing Time by January 28th, 2014
[2497]  Jan 06, 2014  Hannover: End of HLRN-II Operation January 17th, 2014
[2492]  Jan 02, 2014  Berlin Maintenance on Tue, Jan. 7th **extended**



Konrad: Maintenance on Tuesday, Dec 30
[2728] Dec 29, 2014

System maintenance will be performed on the Berlin complex Konrad on Tuesday, December 30, 2014, from 9:00 until afternoon. Konrad will be unavailable during this time.

All login sessions will be terminated and the batch system will be stopped at 9:00 on Tuesday.

We will post a note when Konrad is back online again.

(wwb)


Security Alert at HLRN
[2725] Dec 26, 2014

Dear HLRN User,

we have confirmation that there is a new, serious security alert that affects publicly accessible services of HLRN. Despite Christmas and the acceptance tests Cray is already working to implement a fix. Access to HLRN in Hannover is closed until that fix has been applied.

I regret having to take these measures but I trust you agree with me that a few more days of patience is better than the risk of another month downtime plus hundreds of new passwords issued all over again. Recent events show that those risks are real.

IT security in today's world has become a major factor in the operation of a large, complex and expensive resource such as our HLRN.

We expect to be back online with HLRN / Hannover before Jan 1, 2015. Already submitted jobs will run and terminate normally.

Other than that: Merry Christmas!

Regards, Steffen Schulze-Kremer



Display of the home quota limits
[2722] Dec 22, 2014

After the reinstallation, the way users can display their home quota limits has changed. Please use the command "home_quota" now.

Please keep in mind that the values are recalculated only every 5 minutes.

(hs)


Berlin complex Konrad relaunch
[2720] Dec 19, 2014

In November the HLRN system was closed due to a security incident. As of today, December 19, 2014, the HLRN complex Konrad is open again for user access. For more details and important information regarding access (login), user data, and usage of the new system, please visit our topic HLRN Relaunch 2014.

(ml/wwb)


General Access to HLRN-III Hannover
[2715] Dec 18, 2014

Dear HLRN User,

although much more work is needed before new and old features
are all in place, we open HLRN-III in Hannover today for general
user access during the acceptance test phase. Our main intention
is to provide you access to your data and to get your input in
order to test the system and to identify issues that might be
relevant to the acceptance procedure.

Only a few software packages have been installed so far. We do not
claim to be ready for production but are working on it. Suggestions
are welcome. There will be no accounting of resources until the
end of the acceptance phase.

For HLRN-III / Berlin a separate announcement will follow.

Regards, Steffen Schulze-Kremer

Known major issues:
- no CCM mode yet
- no access to /perm yet (though archived data is intact)
- login in Hannover only via hlogin[1|2|3]
- user account management restricted
- some software license keys are not yet accessible
- software has to be recompiled for new Haswell partition
- three /work user directories are still being restored (users will be notified)
- operation may be briefly interrupted for maintenance
- limited support between Christmas and New Year



HLRN-III Hannover declares Operational Readiness on Dec 8, 2014
[2711] Dec 08, 2014

Dear HLRN User,

today Cray finishes the re-installation of the HLRN Hannover site and will continue with the acceptance tests. Performance tests should be completed by the end of this week. For that we will soon ask members of our benchmark team to log into HLRN-III and verify the results.

We plan to open HLRN-III at Hannover for general user access on Monday, December 15, 2014 (no NPL accounting and still in the acceptance test mode).

User data in /home has been reverted to the state of October 12, well before the attack. User data of /work is being restored from backup of October 31, 2014. If you seriously need files other than those mentioned, please contact our staff. The software of /sw has also been reverted to the state of October 12, 2014.

So, hopefully at the beginning of next week you will be able to continue work on HLRN-III Hannover where you left off. A number of improvements have been made: installation of new, fast hardware for MPP and SMP; additional security measures; new features for analysing and managing energy efficiency of jobs; and more. We will provide more details on these and other features soon.

Keep in mind that until the end of this year we are still in acceptance test mode. That means there may be interruptions of operation on short notice. If you experience problems of any kind, kindly let us know as soon as possible.

Regards, Steffen Schulze-Kremer




Update: Berlin complex Konrad down
[2710] Dec 04, 2014

The HLRN Berlin complex Konrad was the subject of a security incident. Both
the full compute system infrastructure and the HLRN service
portal remain closed.

Update November 27, 2014: We expect general availability of the HLRN system in mid-December.

(ml/gla/ct/ts/wwb)


Status information on the extension of the Berlin complex Konrad
[2693] Oct 02, 2014

The HLRN-III system in Berlin, Konrad, was successfully extended and upgraded by Cray in recent weeks. Currently, the performance of the system is being tuned by Cray and its benchmarkers.

Unfortunately, the performance tests take longer than planned and the acceptance phase with (limited) user access, which was planned for the beginning of October, is postponed.

We expect to have more detailed information within the next week.

Please also see the phase 2 upgrade plan.

(gla)


PERM Archive and Data Management at HLRN/Hannover
[2690] Sep 18, 2014

Dear HLRN-User,

regrettably Cray is still busy debugging the PERM archive at HLRN/RRZN. Data that in the past has been written to PERM is safe and will become accessible again once PERM is back online. However, it appears that PERM will not become available for user access before shutdown of HLRN on Sept 23rd.

Since this conflicts with your ability to archive data, contrary to our general policy of not backing up the WORK filesystem, we are taking extra measures to ensure that data on HOME, WORK and, of course, PERM will survive the HLRN upgrade intact.

In case you urgently need data from PERM before shutdown please contact our staff and we will retrieve that data for you. We expect to be online with the new system and your present data for testing by the end of October 2014 and to open the system for general user access by the end of November 2014.

We apologize for all the inconveniences.

Sincerely, Steffen Schulze-Kremer




Hannover HLRN Upgrade and PERM Maintenance
[2688] Sep 09, 2014

In preparation for the phase 2 installation of HLRN-III at RRZN, user
operation will be terminated on Tuesday, September 23rd, at 12:00h. We
expect to be online with the new system for testing by the end of
October 2014 and to open the system for general user access by the
end of November 2014. Accounting is likely to be re-activated from
January 2015 on.

Regarding PERM in Hannover: due to installation problems, access to
the PERM archive system in Hannover is still suspended. We expect to
receive a new software update tomorrow (September 10th, 2014) that
should fix those problems. Unless our tests show further errors, we
will reopen PERM later this week. Until then, we will try to support
users with urgent requests for retrieving data from or storing data to
PERM on an individual basis. Please contact our staff if you need assistance.


Data in the WORK file system should survive the HLRN-III upgrade,
but, as has been said before, there is no guarantee and no backup of
data in the WORK file system. However, you could copy limited amounts
of sensitive data to your HOME directory as a precaution.

We apologize for all the inconveniences.

Sincerely, Steffen Schulze-Kremer



Resolved: Lustre problems in Hannover
[2686] Sep 04, 2014

The Lustre ($WORK) filesystem in Hannover is available again.

Thank you for your patience.

(HS)


Lustre problems in Hannover - system closed for user access
[2684] Sep 04, 2014

Due to a Lustre ($WORK) problem the HLRN complex in Hannover is closed for user access.

We will post a message when the problem is solved and the system is open again for user access.

We apologise for the inconvenience.

(HS)


XC30 (MPP1) in Hannover available again
[2682] Sep 01, 2014

The XC30 in Hannover is fully available again.

Thank you for your patience.

(HS)





XC30 (MPP1) in Hannover closed
[2680] Sep 01, 2014

Due to MPI network problems the MPP1 queue in Hannover
is closed. The SMP queue is still available.

We apologise for the inconvenience.

(HS)



PERM Hannover remains unavailable
[2675] Aug 21, 2014

Due to an unexpected connection problem, PERM Hannover will remain unavailable for the time being. We are working on the problem. A message will be posted when PERM Hannover is available again.

(GG)


The original message was:
-------------------------
On Wednesday, the 20th of August, starting at 14:00 the $PERM
filesystem in Hannover will be inaccessible due to maintenance.



Maintenance of $PERM in Hannover
[2672] Aug 19, 2014

On Wednesday, the 20th of August, starting at 14:00 the $PERM
filesystem in Hannover will be inaccessible due to maintenance.

We apologise for the inconvenience.

(HS)



Extended Downtime in Berlin for phase 2 upgrade of Konrad starts September 3, 2014
[2667] Aug 15, 2014

The downtime for the upgrade and the extension of the Berlin complex Konrad is scheduled from September 3, 2014, 7:30 until at least September 24, 2014 with no user access to data (HOME, WORK, PERM) in Berlin.

After this upgrade we will start the acceptance phase with limited user access. Full user service is expected to resume on Konrad as soon as possible and no later than November 1, 2014.

For a more detailed time schedule please see our phase 2 upgrade plan.

The downtime of complex Gottfried in Hannover will start on September 23, 2014. A detailed schedule will be announced separately.

Important note:

Protect your important data. WORK (on the Lustre file system) is to be considered a scratch file system. There is no backup available for WORK (see also our topic on data management). Please save important data to tape on the archive file system PERM (see "Data Management").


Please plan accordingly.

(wwb)


Berlin: Maintenance day at ZIB on August 26, 2014
[2666] Aug 15, 2014

The annual infrastructure maintenance will be performed at ZIB on Tuesday, August 26, from 7:00 until approximately 18:00. Some HLRN services will be partially or fully unavailable for ALL users during that time:
  • At 6:00 shutdown of Konrad. All login sessions will be terminated.
  • Between 8:00 and approx. 17:00:
    • ZIB may be partially or entirely cut off from electric power and from the internet.
    • ZIB personnel can be contacted by phone, only.
    • Email traffic to/from ZIB will be interrupted, but mail will not get lost.
    • The HLRN Service portal including the user and project database server (zulassung.hlrn.de) will be unavailable.
    • The licence server for the packages ABAQUS, ANSYS, Fluent, and Totalview (all licensed by ZIB) will be unavailable.
  • Approximately 18:00: All HLRN services are expected to be restored.
The HLRN web and mail servers will remain available during the maintenance.

Please plan accordingly.

(wwb)



Announcement: EMEA IPCC Meeting 9.-10. Sept. 2014 at ZIB
[2663] Aug 01, 2014

Dear colleagues,

for those of you who are interested in developments using the Intel MIC (Xeon Phi) architecture, we would like to bring a conference announcement to your attention:

There are only six weeks to go before the EMEA IPCC User Forum Conference 2014 in Berlin. If you are intending to attend and have not yet registered, we'd like to encourage you to go to the registration page and sign up.

On the registration page, you'll also find:
  • An updated agenda
  • Instructions on how you can reserve a room in the hotel we are recommending


Regards
Stephen Gillich (Intel)
Stephen Blair-Chappell (Intel)
Thomas Steinke (ZIB)


Scheduled Maintenance Complex Hannover Wed Aug 06., starting 08:30
[2661] Aug 01, 2014

HLRN complex Hannover will undergo scheduled maintenance on Wednesday, August 6th, starting at 08:30 a.m.

At this time, we will close the site and terminate login sessions without further warning. New jobs in Hannover will only start if they can finish before Wed 08:30. A message will be posted when the system is open again - we expect the maintenance to take approximately one day.
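The start rule stated above (a job only starts if it can finish before the maintenance begins) can be illustrated with a minimal sketch. This is not the batch system's actual implementation; the function name `can_start` and the choice to treat "finishing exactly at 08:30" as acceptable are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Maintenance start taken from the notice: Wednesday, August 6, 2014, 08:30.
MAINTENANCE_START = datetime(2014, 8, 6, 8, 30)


def can_start(now, walltime, maintenance_start=MAINTENANCE_START):
    """Return True if a job started at `now` with the requested `walltime`
    would end no later than the start of the maintenance window.

    Hypothetical helper: the real scheduler's rule may differ in detail.
    """
    return now + walltime <= maintenance_start


tuesday_evening = datetime(2014, 8, 5, 18, 0)

# A 12-hour job would end at 06:00 on Wednesday, before the maintenance.
print(can_start(tuesday_evening, timedelta(hours=12)))  # True

# A 24-hour job would still be running at 08:30 on Wednesday.
print(can_start(tuesday_evening, timedelta(hours=24)))  # False
```

In practice this means that requesting a shorter wall time can allow a job to start before the maintenance instead of waiting until the system reopens.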

The site in Berlin should not be affected.


(GG)



Resolved - Gottfried: Production on the Cray XC30 in Hannover stopped
[2657] Jul 28, 2014

The XC30 system in Hannover (MPP1 partition) is available
again for user access.

Thank you for your patience.

(HS)



Batch system: Change for small jobs in Berlin
[2656] Jul 27, 2014

Starting Tuesday, 29 July, at 9:00, the MPP1 production class "mpp1q" in Berlin will be aligned with the one in Hannover and will only accept jobs with a minimum job size of 4 nodes.

For small jobs with a job size of less than 4 nodes the new class "mpp1smallq" has been configured in Berlin.

Please submit your small jobs to this new class from now on, if you still prefer to submit jobs to a queue/class directly.

If you do not specify a destination queue/class for your jobs (msub option -q or "#PBS -q" in your job script) you do not need to change anything. Your jobs will be automatically mapped to the corresponding job class by the batch system.
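The class rule described in this notice can be summarised in a few lines. The sketch below only restates the rule for the two MPP1 classes named above; the function name `mpp1_class` is hypothetical, and the code is not the mapping logic the batch system actually uses.

```python
# Minimum job size for "mpp1q" as stated in the notice.
MIN_NODES_MPP1Q = 4


def mpp1_class(nodes):
    """Return the MPP1 class name that accepts a job of the given node count.

    Illustrative only: covers just "mpp1q" and "mpp1smallq" in Berlin.
    """
    if nodes < 1:
        raise ValueError("a job needs at least one node")
    return "mpp1q" if nodes >= MIN_NODES_MPP1Q else "mpp1smallq"


for n in (1, 3, 4, 64):
    print(n, mpp1_class(n))
```

Jobs submitted without an explicit class need no such check, since the batch system performs the mapping itself.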

If your job profile contains a lot of small MPP jobs, please contact your support consultant, as HLRN may limit or completely disable those jobs in the future.

(gla)



Batch system: Requesting hardware by feature specification
[2655] Jul 27, 2014

In batch jobs, the necessary hardware architecture (MPP, SMP, data or postprocessing node) can now be requested by a feature specification (without specifying a destination class/queue).

Also, the Cluster Compatibility Mode (CCM) on the XC30 and the Nvidia K40 GPUs on the SMP nodes can be requested by a feature resource.

Alternatively, the corresponding class may still be requested, but this is not recommended, as the classes and their limits might be changed on short notice for job scheduling adjustments.

For further details, please see the batch system documentation. For examples, see the quickstart guide.

(gla)



Konrad available again; Lustre problems in Berlin solved
[2650] Jul 24, 2014

Since Thursday, July 24, 14:30, Berlin HLRN complex "Konrad" is fully available again for user access.

Over the last few days the Cray service team has worked hard to fix issues with our Lustre installation. The Lustre functionality could be restored without any data loss. Lustre operation will be monitored for the next 7 days.

We apologize for the long downtime and thank you for your patience.

Important note:

We would like to remind all users that WORK on the Lustre filesystem is to be considered a scratch file system. There is no backup available for WORK (see also our topic on data management). Please save important data to tape on the archive file system PERM (see "Data Management").

(wwb/tst/gla)


Konrad: Recurrent Lustre problems in Berlin - system closed for user access
[2649] Jul 21, 2014

Due to recurrent Lustre (WORK filesystem) server crashes the Berlin HLRN complex "Konrad" has been closed for user access since July 18, 10:50. Cray specialists are working on the problem.

We will post an update when the problem is solved and the system is open for user access again.

We apologise for the inconvenience.

(gla/wwb)


Resolved: Konrad: Lustre problems in Berlin
[2643] Jul 16, 2014

Update July 17, 17:00: Since Thursday, July 17, 17:00, Konrad has been fully available again for user access.

The Lustre problems on the $WORK filesystem on the Berlin HLRN complex "Konrad" have been successfully resolved.

Thank you for your patience.

-----
The original announcement was:

Due to a Lustre ($WORK filesystem) problem the Berlin HLRN complex "Konrad" has been closed for user access since the afternoon of Monday, July 14. We are working with the vendor to further analyze and rectify the problems. Konrad will remain closed, probably until Thursday, July 17.

We will post a message when the problem is solved and the system is open again for user access.

We apologise for the inconvenience.

(gla/wwb)


Maintenance of the Archive System ($PERM) in Hannover extended
[2642] Jul 14, 2014

To allow further maintenance and testing, access to $PERM in Hannover
remains closed. We hope to have $PERM back online sometime tomorrow,
Tuesday, July 15.

Compute services in Hannover are not affected.

Regards, Steffen Schulze-Kremer



Maintenance of the Archive System ($PERM) in Hannover preponed to 18:00 today, Thursday 10. 7. 2014
[2637] Jul 10, 2014

To enable Cray's overseas team to use our night time for the
maintenance of $PERM, we have to close access to $PERM from 18:00
today.

We hope to have $PERM back online by Monday morning.
Compute services in Hannover are not affected.

Regards, Steffen Schulze-Kremer


Original message was:
---------------------

Maintenance of the Archive System ($PERM) in Hannover

Due to software maintenance the $PERM filesystem in Hannover
will be unavailable from Friday, the 11th of July starting
at 9:00 until Monday, the 14th of July.

We apologise for the inconvenience.

(HS)



Maintenance of the Archive System ($PERM) in Hannover
[2634] Jul 10, 2014

Due to software maintenance the $PERM filesystem in Hannover
will be unavailable from Friday, the 11th of July starting
at 9:00 until Monday, the 14th of July.

We apologise for the inconvenience.

(HS)



Apply for Computing Time by July 28th, 2014
[2628] Jul 08, 2014

The Scientific Board of the HLRN is accepting project proposals applying for computing time on the HLRN system.

The next deadline is July 28th, 2014, at 23:59.

Projects that need more computing time than 5000 NPL resource units per quarter are invited to submit a project proposal ("Großprojektantrag") to the Scientific Board ("Wissenschaftlicher Ausschuss") of HLRN for the next quarterly review. Resources are allocated for one year starting October 1st, 2014, on a quarterly basis after review of the proposal (see the Application HowTo and the Scientific Board portal page). In 2014 HLRN allocates about 50 million core hours per quarter (equivalent to about 16 million NPL per year) to successful proposals.
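As a rough orientation, the two allocation figures quoted above imply a ratio between core hours and NPL. The short calculation below derives it from the rounded numbers in this notice only; the result is an approximation for illustration and not an official HLRN conversion factor.

```python
# Figures quoted in the notice (both are stated as "about").
CORE_HOURS_PER_QUARTER = 50_000_000
NPL_PER_YEAR = 16_000_000
QUARTERS_PER_YEAR = 4

# Threshold above which a project proposal is required (NPL per quarter).
PROPOSAL_THRESHOLD_NPL = 5000

core_hours_per_year = CORE_HOURS_PER_QUARTER * QUARTERS_PER_YEAR
core_hours_per_npl = core_hours_per_year / NPL_PER_YEAR
threshold_core_hours = PROPOSAL_THRESHOLD_NPL * core_hours_per_npl

print(core_hours_per_year)   # 200000000
print(core_hours_per_npl)    # 12.5
print(threshold_core_hours)  # 62500.0
```

Under these rounded figures, the 5000 NPL per quarter threshold corresponds to roughly 62,500 core hours per quarter.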

Please note: For this application deadline the project abstract can optionally be created in a new format. The appropriate HLRN LaTeX template can be downloaded from the HLRN web page "required documents". From October 2014 on the use of this template will be required.

In case of questions, please contact your local HLRN support person or your HLRN project consultant before submitting the proposal.

(wwb)



Preannouncement: Extended downtimes in Berlin and Hannover in September and October
[2632] Jul 07, 2014

Due to the planned extension of the HLRN-III complexes in Berlin and Hannover there will be longer downtimes in September and October of this year.

According to the preliminary schedule "Konrad" in Berlin will be switched off at the beginning of September for at least three weeks until the beginning of the acceptance phase with limited user access.

"Gottfried" in Hannover will follow later on in September.

Please note that during the downtimes there will be no user access to data stored on the respective complex.

Please plan accordingly.

We will keep you updated as soon as the time schedule for the installation works is fixed.

(gla)



Berlin: Server maintenance at ZIB
[2625] Jun 27, 2014

Due to scheduled in-house server maintenance at ZIB, all HLRN staff at Zuse Institute Berlin will have no email connectivity on Monday, June 30, from 7:00 to 17:00. In urgent cases please send email to support@hlrn.de or call the appropriate person on the phone.

(wwb)


Finished: Maintenance of the Archive (PERM) System in Hannover
[2619] Jun 24, 2014

The maintenance of the archive system in Hannover is finished. Both servers hperm1 and hperm2 are available now.

Please note that the host keys have changed, so you will get a "doing something nasty" message if you logged in before and the servers are in your .ssh/known_hosts file.

The new correct host key fingerprints are:
hperm1 2f:dd:d8:3e:ba:ac:c9:b0:65:f5:4b:7e:7d:57:ef:31
hperm2 d8:70:7d:f7:71:69:c3:04:bf:0d:8f:69:a9:53:28:54
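For readers who want to compare a host key with the fingerprints listed above, the following sketch shows how such a fingerprint is computed. It assumes the fingerprints are in the legacy OpenSSH MD5 format, which their 16-byte colon-separated form suggests: the MD5 digest of the base64-decoded key blob. The function names are hypothetical and the sample key is a made-up placeholder, not a real HLRN host key.

```python
import base64
import hashlib


def md5_fingerprint(public_key_line):
    """Compute the colon-separated MD5 fingerprint of an OpenSSH public key
    line of the form "<type> <base64-blob> [comment]"."""
    fields = public_key_line.split()
    blob = base64.b64decode(fields[1])
    digest = hashlib.md5(blob).hexdigest()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(public_key_line, announced):
    """Compare a computed fingerprint with an announced one, ignoring case."""
    return md5_fingerprint(public_key_line) == announced.strip().lower()


# Made-up placeholder blob for demonstration, NOT a real host key.
sample_blob = base64.b64encode(b"placeholder key blob").decode("ascii")
sample_line = "ssh-rsa " + sample_blob + " example"

sample_fp = md5_fingerprint(sample_line)
print(sample_fp)
print(fingerprint_matches(sample_line, sample_fp))  # True
```

To use it for real, pass the key line presented by the server and one of the fingerprints announced above as `announced`.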

Thank you for your patience,

(GG)


The original message was:
------------------------
In order to integrate several components with the tape archive system
(PERM) in Hannover, a down time is needed tomorrow afternoon (Wed
June 25, starting 15:00). No access to PERM in Hannover is possible during the maintenance. We hope the system will be up again within a few hours.



Newsletter: Sixteenth edition of HLRN Informationen published
[2616] Jun 17, 2014

The sixteenth edition of the HLRN newsletter HLRN Informationen (in German) is available for download at
https://www.hlrn.de/home/view/NewsCenter/NewsLetter.
This web page also contains instructions for (un-)subscribing to the newsletter mailing list.

From the content:
  • Installation of the HLRN-III phase 2 system
  • Project report on "Grenzen definieren, Grenzen überschreiten"
  • News about HLRN-III
(wwb)



Finished: Maintenance of Archive (PERM) System in Hannover
[2613] Jun 16, 2014

The tape archive system (PERM) in Hannover is available again. At this time, only access to hperm1 is possible.

Please take care to verify that you are not over your PERM quota when writing to the system, as your files will then be either incomplete or have a size of 0 bytes. This has been a frequent problem in the past when users wanted to restore incomplete data.

Below you can find a short summary of what is important when working with quota on the PERM storage in case you are interested.

Thank you for your patience,

(GG)


Quota on PERM
--------------
The quota system on PERM has three "dimensions" (inodes/blocks, online/offline, soft/hard), resulting in eight (2 * 2 * 2) kinds of quota (use the command squota -k /qfs1/perm to check yours):

  • inodes: the number of files you can store

  • blocks: the amount of data allowed

  • online: the storage available on the disk cache speeding up transfers to/from tape; get more space using the release command to clean your files from the cache and only retain the tape copy

  • offline: the storage permissible on tape

  • soft: the amount permanently allowed. If you go over soft quota, the grace period timer will start; if this timer runs out, you may not store more data on the system

  • hard: the amount you cannot exceed at any time
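The eight kinds of quota follow from combining the three two-valued dimensions listed above. The sketch below enumerates them and adds a simplified reading of the soft/hard rule; the function `may_store` is a hypothetical illustration of the description above, not the archive system's actual accounting.

```python
from itertools import product

# The three quota "dimensions" named in the summary above.
DIMENSIONS = (
    ("inodes", "blocks"),
    ("online", "offline"),
    ("soft", "hard"),
)

# Every combination of one value per dimension: 2 * 2 * 2 = 8 kinds.
QUOTA_KINDS = list(product(*DIMENSIONS))


def may_store(used, soft, hard, grace_expired):
    """Simplified reading of the soft/hard rule (an assumption):
    nothing more may be stored once the hard limit is reached, or once
    usage is over the soft limit and the grace period has run out."""
    if used >= hard:
        return False
    if used > soft and grace_expired:
        return False
    return True


for kind in QUOTA_KINDS:
    print("/".join(kind))
print(len(QUOTA_KINDS))  # 8

print(may_store(150, soft=100, hard=200, grace_expired=False))  # True
print(may_store(150, soft=100, hard=200, grace_expired=True))   # False
```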




Finished: Installation works in Hannover
[2602] Jun 13, 2014

The infrastructure work in Hannover is finished.
The system is open for user access.

The $PERM file system will probably be available again
next Monday.

Yours sincerely
HLRN Team

(HS)


Maintenance for Archive (PERM) System in Hannover
[2600] Jun 06, 2014

We are sorry to inform you that Cray found another bug in
our TAS installation which they want to fix as soon as possible.

For that reason we now close access to hperm. We do not
have an estimate as to when that maintenance will be finished.

However, due to the extended weekend in Germany it is unlikely
that we will be able to reopen the archive system in Hannover
before Tuesday, 10. 6. 2014.

We apologize for the nuisance.

Steffen Schulze-Kremer


Installation works in Hannover next week June 11. / 12.
[2597] Jun 05, 2014

HLRN Hannover will be shut down on Wednesday next week (June 11) due to infrastructure work for the installation of the next phase.

Access will be disabled after 08:00 a.m. on Wednesday, and the system will not be booted before Thursday evening.

We apologise for the inconvenience.

(GG)


Update: PERM in Hannover available again
[2594] Jun 05, 2014

Access to the HLRN archive system (hperm1 only, for the time being) in Hannover is finally open. This is new hardware that has been assigned the old name for convenience, so you may first need to accept the new host key when logging into the server. A message will appear like

WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!

In this case you can accept the new host key by following the instructions that come with the message. Verify the host key fingerprint to be e5:31:69:ab:c6:60:df:77:0d:ed:65:02:1a:13:18:c3.

We repeat: this is a new system, and since it is an archive, please take care to verify files you put there via sha1sum or something similar. Please be aware that the system is officially still in the acceptance phase, so do not hesitate to notify staff if you notice anything unusual.
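One way to carry out such a check is to record a SHA-1 checksum before archiving a file and compare it with the checksum of the copy retrieved later. The sketch below does this with Python's standard library instead of the sha1sum command; the function name and the temporary demo files are illustrative assumptions.

```python
import hashlib
import os
import tempfile


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-1 digest of a file, read in chunks so
    that large archive files do not have to fit into memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo with two temporary files standing in for the original file and
# the copy retrieved from the archive.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.dat")
    retrieved = os.path.join(tmp, "retrieved.dat")
    for path in (original, retrieved):
        with open(path, "wb") as handle:
            handle.write(b"example payload\n" * 1000)
    checksums_match = sha1_of_file(original) == sha1_of_file(retrieved)

print(checksums_match)  # True
```

The hexadecimal digest has the same form as the checksum printed by sha1sum, so values from either tool can be compared directly.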

To log in or to copy files, you will need to come from one of the HLRN data servers (hdata1, hdata2 or bdata1, bdata2) as was the case with the old installation. Example:
hdata1: $ scp myfiles hperm1.hsn.hlrn.de:/qfs1/perm/myid/mydir

You could also login to hperm1 from within the HLRN and from there initiate a copy to or from an outside server which is a member of the user configurable list of access hosts. The connection has to be initiated from hperm1 for the time being.


(GG)


Update: PERM in Hannover unavailable
[2588] May 21, 2014

Access to hperm1 and hperm2 in Hannover is still closed.

Parts of the new archive hardware show an inconsistent
configuration which needs to be fixed before going
into production.

We will inform you as soon as the new system is ready.


Sorry for the trouble,

(GG)


Maintenance of $PERM in Hannover
[2585] May 08, 2014

Due to hardware maintenance the $PERM filesystem in Hannover
will be unavailable from Monday, the 12th of May until
Friday, the 16th of May.

We apologise for the inconvenience.

(HS)


Verwaltungsrat passed Entgeltordnung
[2583] May 05, 2014

On April 17th 2014, the HLRN Verwaltungsrat passed the new Entgeltordnung governing the charges for user access to the HLRN service. For further information, please visit
https://www.hlrn.de/home/view/Organisation/EntgeltOrdnung.

(ml)


Konrad: Network failure over the weekend
[2575] Apr 28, 2014

Failing hardware caused a massive network outage in the Berlin XC30 at 2:04 on Saturday, April 26, 2014. Many compute nodes crashed and were taken down, thereby terminating several batch jobs. Please contact HLRN Support if you lost a significant amount of NPL due to crashing jobs.

Cray is still working on the problem. Konrad is only partially available and may require a full reboot for proper operation, probably tomorrow morning (Tuesday, April 29).

We apologize for the inconvenience.

(wwb)


Archive servers in Berlin unavailable on April 29, 2014
[2573] Apr 24, 2014

Due to hardware maintenance the archive servers (bperm1, bperm2) and the tape archive $PERM in Berlin will be unavailable on Tuesday, April 29, 2014, from 8:00 until late afternoon.

We will post a note when the archive servers are available again.

Please plan accordingly.

(wwb)


Finished - Maintenance on archive system in Hannover
[2569] Apr 22, 2014

Maintenance on the archive system in Hannover is finished.

Thank you for your patience,

Your HLRN Team

(GG)


The original message was:
-------------------------
Due to proactive maintenance, the PERM file system Hannover will be unreachable for a few hours tomorrow starting 13:00. The Berlin partition will not be affected.



Maintenance on SMP system in Hannover finished
[2567] Apr 10, 2014

The maintenance of the SMP system in Hannover is finished.

Thank you for your patience.

Yours sincerely
HLRN Team

(HS)





Apply for Computing Time by April 28th, 2014
[2565] Apr 07, 2014

The Scientific Board of the HLRN is accepting project proposals applying for computing time on the HLRN system.

The next deadline is April 28th, 2014, at 23:59.

Projects that need more computing time than 5000 NPL resource units per quarter are invited to submit a project proposal ("Großprojektantrag") to the Scientific Board ("Wissenschaftlicher Ausschuss") of HLRN for the next quarterly review. Resources are allocated for one year starting July 1st, 2014, on a quarterly basis after review of the proposal (see the "Application HowTo" and the Scientific Board portal page). In 2014 HLRN allocates about 50 million core hours per quarter (equivalent to about 16 million NPL per year) to successful proposals.

Please contact your local HLRN support person or your HLRN project consultant before submitting the proposal.

(wwb)



Maintenance on SMP system in Hannover
[2562] Apr 07, 2014

Due to maintenance the SMP system in Hannover will not be
available on Wednesday, Apr 9th, 2014.

Yours sincerely
HLRN Team

(HS)



PERM Maintenance in Hannover Wed Mar 26 09:00 a.m.
[2555] Mar 20, 2014

PERM will be unavailable in Hannover due to scheduled maintenance on Wed Mar 26, starting 09:00 a.m.

We expect the system to be available again later the same day.

(GG)



Finished: Maintenance in Hannover
[2553] Mar 19, 2014

Hannover is available again.

Thank you for your patience,
Your HLRN team

(GG)


The original message was:
------------------------
Due to maintenance the HLRN complex in Hannover will not be
available from Tuesday, Mar 18th, 2014 9:00, until at least
Wednesday, Mar 19th, 2014 12:00.



Maintenance in Hannover
[2549] Mar 13, 2014

Due to maintenance the HLRN complex in Hannover will not be
available from Tuesday, Mar 18th, 2014 9:00, until at least
Wednesday, Mar 19th, 2014 12:00.


We will post a note when Hannover is available again for user access.

Yours sincerely
HLRN Team

(HS)



Solved: access to PERM in Hannover
[2547] Mar 11, 2014

Login to the hperm servers should work again. Please do not hesitate to report any problems you might find.

(GG)


The original message was:
--------------------------
We have a problem with the LDAP connection to the PERM servers in Hannover [...]


Access to PERM in Hannover with problems
[2544] Mar 10, 2014

We have a problem with the LDAP connection to the PERM servers in Hannover, meaning that you will have difficulties logging onto hperm1 and hperm2 at certain times. We will try to fix the issue tomorrow.

(GG)




Detailed job accounting information available
[2542] Feb 27, 2014

The detailed information of NPL usage on a per-job basis is now available via the HLRN Service Portal.
  • For your personal account and all project accounts you have access to, click "Informationen zu einer Benutzerkennung" and scroll down to "Accounting Informationen". Below each account you will find the button "Zeige Jobs dieses Kontos seit Quartalsbeginn". Pressing it shows all jobs for that account in the current quarter.
  • For quick access to all jobs of a project account, click "Informationen zum Projekt" (you must be a member of that project) and scroll down to "Konto-Informationen für das Projekt-Konto". There you will find the button "Zeige Jobs des Projektes seit Quartalsbeginn", which shows all jobs for that project account in the current quarter.
(sw/wwb)


Maintenance of PERM in Hannover finished
[2537] Feb 24, 2014

The maintenance of the PERM filesystem in Hannover is finished.

Thank you for your patience,

(GG)


The original message was:
--------------------------
On Monday, the 24th of February, the $PERM filesystem in Hannover
will be inaccessible due to maintenance.



Maintenance of $PERM in Hannover
[2534] Feb 21, 2014

On Monday, the 24th of February, the $PERM filesystem in Hannover
will be inaccessible due to maintenance.

We apologise for the inconvenience.

(HS)


Berlin Maintenance on Tue, Feb 25, 2014
[2531] Feb 20, 2014


To improve Lustre stability in Berlin, hardware maintenance will be performed on the Berlin complex Konrad from Tuesday, Feb 25, 9:00 until Wednesday, Feb 26, morning. [Update Feb 26, 9:15: The maintenance has to be extended until Wed, Feb 26, afternoon.]
Konrad will be unavailable during this time. All login sessions will be terminated and the batch system will be stopped.

We will post an update when Konrad is available again.

(wwb)




Finished: Maintenance of $PERM in Hannover
[2527] Feb 14, 2014

The PERM (/qfs1) filesystem in Hannover is available again.

Thank you for your patience.

(HS)




Maintenance of $PERM in Hannover
[2525] Feb 13, 2014

On Friday, the 14th of February, the $PERM filesystem in Hannover
will be inaccessible due to maintenance.

We apologise for the inconvenience.

(HS)


Berlin Maintenance on Tue, Feb. 11, 2014
[2520] Feb 06, 2014

Hardware maintenance will be performed on the Berlin complex Konrad from Tuesday, Feb 11, 9:00 until Wednesday, Feb 12, morning. Konrad will be unavailable during this time. All login sessions will be terminated and the batch system will be stopped.

We will post an update when Konrad is available again.

(wwb)




Finished: Hannover Maintenance on Tue, Feb. 4th
[2518] Feb 05, 2014

The maintenance of the HLRN complex in Hannover is
finished. The system is fully available again.

Yours sincerely
HLRN Team

(HS)



HLRN Hannover available again
[2508] Jan 29, 2014

HLRN Hannover is available again.

(GG)


The original message was:
-------------------------
The HLRN-III system in Hannover just executed an emergency power-off due to a defective valve in the water cooling system. We are working on the problem.




HLRN-III Hannover - Lustre Problem solved
[2505] Jan 15, 2014

The HLRN complex at Hannover is available for users again.

(HS)




HLRN-III - Lustre problem in Hannover
[2503] Jan 15, 2014


Due to a problem with the Lustre file system ($WORK)
the HLRN-III complex at Hannover is closed.

We will post a message when the problem is solved and apologize
for the inconvenience.

(HS)




NPL accounting active since January 1st, 2014
[2502] Jan 13, 2014

On January 1, 2014 regular NPL accounting started on HLRN-III in Berlin and Hannover.

This means that computing time consumed by batch jobs is charged to the quarterly granted NPL allocation (project allocation or personal allocation). Job submission is possible only as long as the allocation is not used up.

See also our web page on the user, project, and allocation management policies and options.

(wwb)


Apply for Computing Time by January 28th, 2014
[2494] Jan 10, 2014

The Scientific Board of the HLRN is accepting project proposals applying for computing time on the HLRN system.

The next deadline is January 28th, 2014, at 23:59.

Projects that need more computing time than 5000 NPL resource units per quarter are invited to submit a project proposal ("Großprojektantrag") to the Scientific Board ("Wissenschaftlicher Ausschuss") of HLRN for the next quarterly review. Resources are allocated for one year starting April 1st, 2014, on a quarterly basis after review of the proposal (see the "Application HowTo" and the Scientific Board portal page). In 2014 HLRN allocates about 50 million core hours per quarter (equivalent to about 16 million NPL per year) to successful proposals.

Please contact your local HLRN support person or your HLRN project consultant before submitting the proposal.

(wwb)



Hannover: End of HLRN-II Operation January 17th, 2014
[2497] Jan 06, 2014

Dear HLRN-User,

HLRN-II will be taken offline permanently on January 17th, 2014. Until then you may continue to use HLRN-II.

Please make sure that you have all your data migrated from HLRN-II to HLRN-III by January 17th, since there will be no access to HLRN-II file systems after January 17th.

Sincerely, Steffen Schulze-Kremer


Berlin Maintenance on Tue, Jan. 7th **extended**
[2492] Jan 02, 2014

Update Thu, Jan 9th, 16:00: The maintenance has to be extended until Friday, January 10th.

The original announcement was:

Maintenance has been scheduled on the Berlin Cray system Konrad from Tuesday, Jan 7th, 2014 9:00, until Wednesday, Jan 8th, 2014 afternoon. The Berlin Cray complex will be fully unavailable during this time. At 9:00 on Tuesday all login sessions will be terminated and the batch system will be stopped.

We will post a note when Berlin is available again for user access.
Please plan accordingly.

(wwb)



Topic revision: r1 - 2015-05-27 - WolfgangBaumann
 