Saturday 3 December 2011

Subversion "fun"

Well, although git seems to be better than subversion even for our needs, we still use subversion and will keep using it in order to provide long-term support for our customers, who might need changes made to some old projects... Anyway, the story is not about that...


"It was a rainy Thursday". Since we could rely on our monitoring system, nobody was expecting any sudden trouble, and I was doing my usual stuff... Suddenly... Suddenly we received several calls at once. All users reported problems accessing the SVN server. A few seconds later the monitoring system reported: 100% use of the / (root) partition. WTH?! There should have been two threshold warnings before that, at 90% and 95%! Well, log in, check... Well... the subversion repos are on another partition, so it's some system stuff... Logs are of normal size; logrotate is configured for that. Aha!.. Several gigs in /tmp.... in a single file... Well, # lsof | grep filename and we have the culprit. It was the svnsync process (we run it from the post-commit hook to keep a remote mirror synchronized), fortunately showing the path to the repository it was syncing.
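For the record, the hunt above can be sketched as two small shell helpers. This is only a sketch of the idea, not the exact commands typed that day: the function names `largest_files` and `who_has_open` are made up for illustration, and the file name in the example is hypothetical.

```shell
#!/bin/sh
# List the biggest regular files under a directory (sizes in KiB), largest first.
# -xdev keeps find on one filesystem, so other mounts are not scanned.
largest_files() {
    dir="$1"
    count="${2:-5}"
    find "$dir" -xdev -type f -exec du -k {} + 2>/dev/null | sort -rn | head -n "$count"
}

# Show the processes that currently hold the given file open.
who_has_open() {
    lsof "$1"
}

# Example (run as root so that every process is visible):
#   largest_files /tmp 3
#   who_has_open /tmp/some-huge-file    # hypothetical file name
```

Passing the file to lsof directly does the same job as lsof | grep, but without listing every open file on the box first.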

Now, let's find the number of the last revision synchronized to the remote server (let's call it N), and then take a look at revision N+1. A single file had been committed, named other_repo_name.tar.gz. With a size of a few gigs...
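A rough sketch of that lookup, assuming the mirror is maintained by svnsync (which records its progress in the svn:sync-last-merged-rev property on revision 0 of the mirror). The helper names `next_unsynced_rev` and `inspect_rev`, as well as the URL and path in the example, are placeholders and not something from our setup.

```shell
#!/bin/sh
# Print N+1: the first revision the mirror has not received yet.
# svnsync keeps N in a revision property on revision 0 of the mirror.
next_unsynced_rev() {
    mirror_url="$1"
    last=$(svn propget svn:sync-last-merged-rev --revprop -r 0 "$mirror_url") || return 1
    echo $((last + 1))
}

# Show author, date and log message of a revision, then the paths it changed.
inspect_rev() {
    repo_path="$1"
    rev="$2"
    svnlook info "$repo_path" -r "$rev"
    svnlook changed "$repo_path" -r "$rev"
}

# Example (placeholder URL and path):
#   rev=$(next_unsynced_rev https://mirror.example.com/svn/XXXX)
#   inspect_rev /srv/svn/XXXX "$rev"
```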

First of all, block the synchronization to remote mirrors and block all commits (exit 1 in the pre-commit hook; don't forget to print a message for users, something like `echo "Commits are blocked, please contact support for more details" 1>&2`). It seems we will need to cut this commit out of the DB. Now that further risks are eliminated, let's call the user and talk.
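A minimal sketch of such a "closed for maintenance" hook. REPO is a placeholder: it defaults to a scratch directory here so the snippet can be tried safely, and on a real server it would point at the repository itself (where you would also want to back up any existing pre-commit hook first).

```shell
#!/bin/sh
# REPO is a placeholder; set it to the real repository path before running.
REPO="${REPO:-$(mktemp -d)}"
mkdir -p "$REPO/hooks"

# The hook rejects every commit: whatever it writes to stderr is shown to the
# committing user, and the non-zero exit status aborts the transaction.
cat > "$REPO/hooks/pre-commit" <<'EOF'
#!/bin/sh
echo "Commits are blocked, please contact support for more details" 1>&2
exit 1
EOF
chmod +x "$REPO/hooks/pre-commit"
```

To reopen the repository, remove the hook or put the original one back.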

- Hi Mike (let's call him Mike), this is John Smith (let's call me John :)) from the IT dept. We noticed that you have committed a file of a few gigs to the XXXX repo, haven't you?
- Hi, yes I did.
- But do you know that we have other services for storing archives, like _service1_ and _service2_? Our SVN servers, on the other hand, are meant for version control.
- Yes I know, that's why I put a copy of the repository YYYY there (Note: YYYY is another repo on the same server), but since its sources were too big I compressed it so it wouldn't take too much space on the server...
- ... *confused... (explaining to users the basics of working with services like SVN is far from my responsibility). Well, Mike, I understand. Thank you for the explanation. Just to inform you, the XXXX repo will not be available for commits for a few hours. Bye...


Next, I called the head of Mike's department. Not to complain, but just to describe the situation and to ask him to inform his team that the repo would be unavailable for commits for a few hours (dump/recover), and that the one commit made after the huge one would have to be committed again.
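The dump/recover part can be sketched roughly like this, assuming the simplest approach: dump everything up to the last good revision N and load it into a freshly created repository. The function name `rebuild_without_tail` and the paths in the example are made up for illustration, and this is a sketch rather than the exact procedure used.

```shell
#!/bin/sh
# Rebuild a repository from revisions 0..N only, dropping everything after N.
rebuild_without_tail() {
    old_repo="$1"
    new_repo="$2"
    last_good="$3"
    dumpfile=$(mktemp) || return 1

    # Going through a file (instead of a pipe) lets a failed dump stop the
    # chain before anything gets loaded into the new repository.
    svnadmin create "$new_repo" &&
    svnadmin dump "$old_repo" -r "0:$last_good" > "$dumpfile" &&
    svnadmin load "$new_repo" < "$dumpfile"
    status=$?

    rm -f "$dumpfile"
    return $status
}

# Example (placeholder paths, N = 1234):
#   rebuild_without_tail /srv/svn/XXXX /srv/svn/XXXX.new 1234
```

The dump contains only the versioned data, so the hooks and conf directories have to be copied to the new repository by hand before swapping it in, and the revisions after N are gone, which is why the later commit has to be redone.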

The moral of the story: on important servers, keep /tmp on a separate partition! I do now :)
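A quick way to check a server for this, sketched with plain df. The helper name `on_same_filesystem` is made up, and the sketch assumes mount points without spaces in their names.

```shell
#!/bin/sh
# Succeed when both paths live on the same mounted filesystem.
# In "df -P" output the second line describes the path and the
# sixth column is the mount point.
on_same_filesystem() {
    a=$(df -P "$1" | awk 'NR==2 {print $6}')
    b=$(df -P "$2" | awk 'NR==2 {print $6}')
    [ "$a" = "$b" ]
}

if on_same_filesystem /tmp /; then
    echo "WARNING: /tmp shares a filesystem with /" 1>&2
fi
```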
