• CommentRowNumber1.
• CommentAuthorTobyBartels
• CommentTimeJul 29th 2009

I just wrote an addition to regular space, and I can't even tell if it got saved properly. There's no response from the Lab (just a long wait, no error message, until it times out), even though I've restarted it twice.

Incidentally, it's been at least a week since I've been able to get more than two characters into the terminal at a time. Even to restart it with ~/x, I have to log in twice!

• CommentRowNumber2.
• CommentAuthorTobyBartels
• CommentTimeJul 29th 2009

From wget, I get ‘ERROR 504: Gateway Timeout.’.

/etc/init.d/instiki status says ‘instiki dead but subsys locked’; anything besides status gives an error that /usr/local/instiki/tmp/pids/server.pid doesn't exist or that instiki is already running. But ps -Al gives nothing that looks relevant (except for lighttpd).

Also, apparently the reason why I could never get more than two characters in is that typing ~ will always freeze it! Otherwise, the ssh terminal is doing just fine. So cd; ./x is probably best from now on.

• CommentRowNumber3.
• CommentAuthorAndrew Stacey
• CommentTimeJul 29th 2009

When instiki is launched from /etc/init.d/instiki, a lockfile is created. The lockfile isn't created by instiki itself, but by the script in /etc/init.d. Thus if instiki crashes, the lockfile isn't removed and the script won't restart it. So it's necessary to remove the lockfile before relaunching it.

rm /var/lock/subsys/instiki

But one should check that instiki isn't running before removing the lockfile (via ps auxc).
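
The check-then-remove sequence above could be sketched as a small shell function. This is only an illustration: the function name `clear_stale_lock` is made up, the process name passed to it is an assumption (instiki may well show up under another name, such as the Ruby interpreter), and `pgrep` is assumed to be installed on the server.

```shell
#!/bin/sh
# Hypothetical helper, not part of the real init script:
# remove a lockfile only if no process with the given name is running.
clear_stale_lock() {
    lockfile="$1"    # e.g. /var/lock/subsys/instiki
    procname="$2"    # assumed process name, e.g. instiki

    # pgrep -x matches the process name exactly.
    if pgrep -x "$procname" >/dev/null 2>&1; then
        echo "$procname is still running; leaving $lockfile alone"
        return 1
    fi

    if [ -e "$lockfile" ]; then
        rm -f "$lockfile" && echo "removed stale $lockfile"
    else
        echo "no lockfile at $lockfile"
    fi
}

# Usage on the server (as root), followed by a relaunch:
#   clear_stale_lock /var/lock/subsys/instiki instiki && /etc/init.d/instiki start
```

The function refuses to touch the lockfile while a matching process exists, so it cannot turn a merely slow instiki into two competing ones.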

• CommentRowNumber4.
• CommentAuthorAndrew Stacey
• CommentTimeJul 29th 2009
• (edited Jul 29th 2009)

Oh, and some implementations of ssh use ~ as an escape character. To quote:

-e escape_char

Sets the escape character for sessions with a pty (default: ‘~’). The escape character is only recognized at the beginning of a line. The escape character followed by a dot (‘.’) closes the connection; followed by control-Z suspends the connection; and followed by itself sends the escape character once. Setting the character to “none” disables any escapes and makes the session fully transparent.

So you could avoid the problem by doing ssh -e '#' or something like that.
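
For anyone who would rather not think about the escape character at all, the "none" setting quoted above could be wrapped up as follows. This is a sketch: the wrapper name `ssh_no_escape` and the host name are placeholders, not anything that exists on the server.

```shell
# Hypothetical wrapper: start ssh with the escape character disabled, so
# that '~' at the start of a line is passed through to the remote shell.
ssh_no_escape() {
    ssh -e none "$@"
}

# Usage (the host name is a placeholder):
#   ssh_no_escape user@server.example.org
#
# The equivalent per-host setting in ~/.ssh/config would be:
#   Host server.example.org
#       EscapeChar none
```

With escapes disabled, the usual `~.` trick for killing a hung session no longer works, which is the price of a fully transparent session.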

• CommentRowNumber5.
• CommentAuthorTobyBartels
• CommentTimeJul 29th 2009

Or just type ~ twice. Jeez, why didn't I ever try that? Even accidentally!

Thanks for /var/lock/subsys/instiki; I knew that there was a lock file somewhere, but I never knew where it was.

• CommentRowNumber6.
• CommentAuthorUrs
• CommentTimeJul 29th 2009

We need some method to archive important results of discussions here. In five months the nLab may halt while I am working on it and I will have to try to remember where it was that I saw you two chat about lockfiles.

A good idea might be to split off from the nLab's HowTo page an AdminHowTo page that lists answers to issues like these.

• CommentRowNumber7.
• CommentAuthorAndrew Stacey
• CommentTimeJul 30th 2009

It's a moot point, but I would put this sort of stuff on the 'nlabmeta' web. Most people don't need to know about this stuff, and most people couldn't do anything about it even if they did know it.

But you're right. We do need to archive some of the stuff here. Some of it does get done (like how to do redirects) but it's not as simple as just copying the solution to a problem across from here to there. For example, take the 'downloading the n-lab' thread. There's a useful script there that anyone who wants to download the n-lab should know about. However, if someone hasn't yet had the inclination to download the entire n-lab then I'd rather not put the idea into their heads: if everyone does it, then there goes our bandwidth!

We probably have three categories of information:

1. Stuff everyone might find useful. Such as how to do redirects, or include SVGs.

2. Stuff that anyone could find useful, but isn't necessary for ordinary use of the n-lab. Such as downloading the whole lot.

3. Stuff that only the lab elves need to know. Such as how to reboot the server.

I'd recommend: n-lab HowTo for the first, n-labmeta HowTo for the second, and a lab elves technical page for the third.

I'm not saying that any information should be hidden or deliberately obscured, but that it is layered in such a fashion that the most likely and useful information is encountered first.

• CommentRowNumber8.
• CommentAuthorTim_Porter
• CommentTimeDec 25th 2010
• (edited Dec 27th 2010)

The Lab seems to be out of action. Perhaps it has a hang-over after celebrating too much last night!!! (For the visible record, today is Christmas day)

• CommentRowNumber9.
• CommentAuthorTobyBartels
• CommentTimeDec 26th 2010

It works now for me.

• CommentRowNumber10.
• CommentAuthorMike Shulman
• CommentTimeDec 27th 2010

I couldn’t get to it just now, so I restarted it (again?).

• CommentRowNumber11.
• CommentAuthorRodMcGuire
• CommentTimeDec 31st 2010

nLab seems down to me for a few hours (after a long time it returns a blank page for any request)

• CommentRowNumber12.
• CommentAuthorMike Shulman
• CommentTimeDec 31st 2010

I restarted it again.

• CommentRowNumber13.
• CommentAuthorTim_Porter
• CommentTimeDec 31st 2010

@Rod Try again. It gave me a blank to start with but is now working normally it seems.

• CommentRowNumber14.
• CommentAuthorUrs
• CommentTimeDec 31st 2010
• (edited Dec 31st 2010)

I have just restarted it once more.

I noticed the last two days, the nLab had a multitude of down-times of a few minutes, from which it did recover – I believe these recoveries are due to Andrew’s recent modification, which makes the instiki software restart on a regular basis of minutes or something like this.

So I am wondering about two things:

• even if it does recover, why is it down so many times per day?

• why does the automatic recovery still fail some time?

• CommentRowNumber15.
• CommentAuthorDavidRoberts
• CommentTimeJan 1st 2011

Lab is down again.

• CommentRowNumber16.
• CommentAuthorTim_Porter
• CommentTimeJan 1st 2011

It still seems to be down. Happy New Year everyone. 2011 has got here ’safely’.

• CommentRowNumber17.
• CommentAuthorTim_Porter
• CommentTimeJan 1st 2011
• (edited Jan 1st 2011)

(4 minutes later) Its automatic reboot worked :-) It is back.

• CommentRowNumber18.
• CommentAuthorMike Shulman
• CommentTimeJan 1st 2011

Actually, I just restarted it manually.

• CommentRowNumber19.
• CommentAuthorTim_Porter
• CommentTimeJan 1st 2011

@Mike. Can’t be right always! so doubly Happy New Year to you.

• CommentRowNumber20.
• CommentAuthorzskoda
• CommentTimeJan 4th 2011

Down again? I cannot access the nLab now (16:25 CET).

• CommentRowNumber21.
• CommentAuthorUrs
• CommentTimeJan 4th 2011

I have restarted it.

• CommentRowNumber22.
• CommentAuthorTim_Porter
• CommentTimeJan 5th 2011
• (edited Jan 5th 2011)

It seems to be down again.

Later: What is the situation about automatic restart since it has been out for at least 15 minutes now?

• CommentRowNumber23.
• CommentAuthorUrs
• CommentTimeJan 5th 2011

I have restarted the server.

• CommentRowNumber24.
• CommentAuthorTobyBartels
• CommentTimeJan 8th 2011

Down again, and I’m not at a place where I can get in to the server. (I don’t have my key with me.)

• CommentRowNumber25.
• CommentAuthorDavidRoberts
• CommentTimeJan 8th 2011

Yeah, I was just about to say. Someone should host Jim’s recently uploaded article in a stable place for now and change the link at the cafe, as it reflects badly on the cafe not being able to satisfy links.

• CommentRowNumber26.
• CommentAuthorUrs
• CommentTimeJan 8th 2011

I have restarted the server.

• CommentRowNumber27.
• CommentAuthorUrs
• CommentTimeJan 8th 2011

it reflects badly on the cafe not being able to satisfy links.

I find that the smallest problem. What worries me is the Lab being down. And what really worries me is: the $n$Journal-to-be not being up.

I wish I knew what we could do.

• CommentRowNumber28.
• CommentAuthorUrs
• CommentTimeJan 8th 2011

I have restarted it again.

• CommentRowNumber29.
• CommentAuthorRodMcGuire
• CommentTimeJan 19th 2011

Long waits before it finally returns an empty page. Persisting for about 1 hour now.

• CommentRowNumber30.
• CommentAuthorUrs
• CommentTimeJan 19th 2011

I have restarted it now.

• CommentRowNumber31.
• CommentAuthorAndrew Stacey
• CommentTimeJan 19th 2011

We’ve been getting some memory spikes recently, with the instiki processes going up to about 3 times their usual level. When I’ve spotted them, I’ve done a “soft reset” which has worked fine. When I get a bit of time, I’ll track down what’s causing them and report back to Jacques.
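
One rough way to watch for such spikes is to total the resident memory of the relevant processes from `ps` output. The sketch below is an assumption-laden illustration: the helper name `sum_rss` is made up, and the process name `ruby` is a guess at how the instiki processes appear in the process list.

```shell
# Hypothetical helper: sum resident memory (RSS, in KiB) of all processes
# whose command name equals the given name.
# Reads "RSS COMMAND" lines on stdin, as produced by: ps -e -o rss= -o comm=
sum_rss() {
    awk -v name="$1" '$2 == name { total += $1 } END { print total + 0 }'
}

# Usage (process name is an assumption):
#   ps -e -o rss= -o comm= | sum_rss ruby
```

Logging this number every few minutes would show whether a spike precedes each outage or whether the two are unrelated.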

• CommentRowNumber32.
• CommentAuthorDavidRoberts
• CommentTimeJan 27th 2011

It’s down again…

• CommentRowNumber33.
• CommentAuthorUrs
• CommentTimeJan 27th 2011

• CommentRowNumber34.
• CommentAuthorzskoda
• CommentTimeJan 31st 2011
• (edited Jan 31st 2011)

I can’t load nLab, including my personal pages. Edit: after few minutes succeeded. Edit later: veeery long loads continue. Edit even later: much better now, almost normal!

• CommentRowNumber35.
• CommentAuthorzskoda
• CommentTimeFeb 1st 2011

nLab does not load now (Feb 1, 15:24)

• CommentRowNumber36.
• CommentAuthorTim_Porter
• CommentTimeFeb 1st 2011
• (edited Feb 1st 2011)

It has been off for some time. (Edit: it is back 14.39 UK)

• CommentRowNumber37.
• CommentAuthorUrs
• CommentTimeFeb 1st 2011

I have now restarted it.

• CommentRowNumber38.
• CommentAuthorjcmckeown
• CommentTimeFeb 3rd 2011

Down, 22:02:24 UTC 2011

• CommentRowNumber39.
• CommentAuthorUrs
• CommentTimeFeb 4th 2011

restarted

• CommentRowNumber40.
• CommentAuthorTim_Porter
• CommentTimeMar 16th 2011

It seems to be down again.

• CommentRowNumber41.
• CommentAuthorUrs
• CommentTimeMar 16th 2011
• (edited Mar 16th 2011)

I have restarted it.

(Would have done it an hour ago, had I not had wild problems with my internet connection.)

• CommentRowNumber42.
• CommentAuthorTim_Porter
• CommentTimeMar 16th 2011

Thanks

• CommentRowNumber43.
• CommentAuthorUrs
• CommentTimeMar 16th 2011

It might be a good idea if you (Tim) for instance also got access to the $n$Lab server. Since we are back to the point where it needs restarting once or twice a day, it would be good to have more people be able to do so. If you are willing to help with this, I suppose Andrew would be glad to create an account for you. (You need a public key, though. In case you are not familiar with the trouble one has to go through for this, I can send you a step-by-step list for what to do.)

• CommentRowNumber44.
• CommentAuthorAndrew Stacey
• CommentTimeMar 16th 2011

The steering committee should decide on this, but certainly I have no problems with Tim having reboot access to the nLab.

What I’d really like is for someone who knows a thing or two about what might be the problem to tell me what to look for!

• CommentRowNumber45.
• CommentAuthorTim_Porter
• CommentTimeMar 20th 2011
• (edited Mar 20th 2011)

Guess what …..

PS still down 6 hours later.

Phew! back at last (8 hours).

• CommentRowNumber46.
• CommentAuthorUrs
• CommentTimeMar 21st 2011

I was offline all weekend. Was sick.

• CommentRowNumber47.
• CommentAuthorTim_Porter
• CommentTimeMar 21st 2011

Hope you get well soon.

Is the n-lab set for automatic reboot still?

• CommentRowNumber48.
• CommentAuthorUrs
• CommentTimeMar 21st 2011

Is the n-lab set for automatic reboot still?

I think it is, but for some reason the automatic reboot doesn’t always work. I suppose when it hangs it hangs so badly that it cannot do anything anymore, hence also not reboot itself.

• CommentRowNumber49.
• CommentAuthorAndrew Stacey
• CommentTimeMar 21st 2011

Actually, the automatic reboot stuff doesn’t seem to work as I’d hoped it would. It seems that it goes into the restart cycle, but the restart isn’t in place by the time it next checks, so it assumes that the restart didn’t work and goes into a sulk. I need to investigate other systems.
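
One possible way around that failure mode is to give each restart a grace period during which no further restart is attempted. The following is a minimal sketch under stated assumptions, not the monitoring setup actually in use: the function names, the timestamp file path and the 300-second grace period are all invented for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: allow a new restart only after a grace period.
STAMP="${STAMP:-/tmp/instiki-last-restart}"  # assumed location of the timestamp file
GRACE="${GRACE:-300}"                        # assumed grace period in seconds

# restart_allowed NOW: succeed if no restart was recorded within the
# last $GRACE seconds (NOW is a Unix timestamp).
restart_allowed() {
    now="$1"
    last="$(cat "$STAMP" 2>/dev/null)"
    [ -n "$last" ] || return 0           # no earlier restart recorded
    [ $((now - last)) -ge "$GRACE" ]
}

# guarded_restart CMD...: run the restart command and record the time,
# unless an earlier restart is still within its grace period.
guarded_restart() {
    now="$(date +%s)"
    if restart_allowed "$now"; then
        echo "$now" > "$STAMP"
        "$@"
    else
        echo "restart issued recently; waiting for it to take effect"
    fi
}
```

A monitor that finds the site unresponsive would call `guarded_restart` with whatever restart command is normally run by hand, and a check that arrives while the previous restart is still coming up simply waits instead of giving up.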

• CommentRowNumber50.
• CommentAuthorUrs
• CommentTimeMar 22nd 2011

Can’t we just set a cron job that calls the command which we call by hand every now and then?
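
A cron job of that kind might look roughly like the sketch below. Everything specific in it is an assumption: the function name, the script path, the URL, the 30-second timeout and the restart command are placeholders for whatever is actually used on the server, and `curl` is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical cron helper: run the given restart command only when the
# URL fails to answer in time.
check_and_restart() {
    url="$1"; shift                      # remaining arguments: restart command
    # -f: treat HTTP errors (such as 504) as failure
    # --max-time 30: give up after 30 seconds
    if ! curl -fsS --max-time 30 -o /dev/null "$url"; then
        "$@"
    fi
}

# Example crontab entry (every 5 minutes), assuming this file is saved as
# /usr/local/bin/nlab-check.sh and ends with the call shown below:
#   */5 * * * * /usr/local/bin/nlab-check.sh
# Last line of the script (restart command is a placeholder):
#   check_and_restart http://localhost/ /etc/init.d/instiki restart
```

Checking first, rather than restarting blindly on a schedule, avoids interrupting people's edits while the site is healthy.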

• CommentRowNumber51.
• CommentAuthorTim_Porter
• CommentTimeApr 8th 2011

The lab would seem to have gone down again.

• CommentRowNumber52.
• CommentAuthorUrs
• CommentTimeApr 8th 2011

I seem to have restarted it.

• CommentRowNumber53.
• CommentAuthorTim_Porter
• CommentTimeApr 8th 2011

It is working now, thanks.

• CommentRowNumber54.
• CommentAuthorTim_Porter
• CommentTimeApr 9th 2011

Down again.

• CommentRowNumber55.
• CommentAuthorUrs
• CommentTimeApr 9th 2011

I have restarted it.

Tim, should we try to give you access to the nLab server? Would you be willing to? Given the number of your lab-down-reports, it would be really good for the $n$Lab community and for you, it seems, if you could restart the lab.

• CommentRowNumber56.
• CommentAuthorzskoda
• CommentTimeApr 12th 2011
• (edited Apr 12th 2011)

$n$Lab seems not responsive at the moment. Edit: 10 minutes later: $n$Lab is back.

• CommentRowNumber57.
• CommentAuthorUrs
• CommentTimeApr 12th 2011
• (edited Apr 12th 2011)

As long as I am working on the lab myself, I notice quickly when it goes down. Trouble begins when I am not working on it myself.

• CommentRowNumber58.
• CommentAuthorzskoda
• CommentTimeApr 13th 2011
• (edited Apr 13th 2011)

Edit: I could not get the response from $n$Lab for last several minutes, but just now it is back again, but somewhat slow. New entry Maschke’s theorem.

• CommentRowNumber59.
• CommentAuthorTim_Porter
• CommentTimeApr 19th 2011

Topically as far as another thread goes, the Lab seems to be down. Although this does not apply this time, I seemed to notice that it crashes at weekends. Is this true? Is it internal to the software (e.g. a routine scheduled garbage collection or something like that) which overloads it, or is there some external source such as someone downloading new material to have a personal copy?

• CommentRowNumber60.
• CommentAuthorUrs
• CommentTimeApr 19th 2011
• (edited Apr 19th 2011)

Topically as far as another thread goes, the Lab seems to be down.

I have now restarted it.

Although this does not apply this time, I seemed to notice that it crashes at weekends. Is this true?

It crashes several times each day. But during the week I am usually online and notice it fairly quickly, and restart it without always dropping a note here. On weekends even I am offline for longer periods. I think that explains it.

But I will now try to record every single time that I restart the server, so that we get a better idea.

• CommentRowNumber61.
• CommentAuthorUrs
• CommentTimeApr 19th 2011

have restarted the Lab

• CommentRowNumber62.
• CommentAuthorMarc
• CommentTimeApr 19th 2011

Does not seem to work. An hour ago I edited "extensive categories" without any reaction upon submission.

• CommentRowNumber63.
• CommentAuthorUrs
• CommentTimeApr 19th 2011

It seems to have recovered without me doing anything.

• CommentRowNumber64.
• CommentAuthorUrs
• CommentTimeApr 20th 2011

have restarted the server

• CommentRowNumber65.
• CommentAuthorUrs
• CommentTimeApr 20th 2011

I have restarted the $n$Lab.

But then something curious happened: the pages that I was waiting for to display did display the instant that my finger touched the enter-key to send off the restart command.

Maybe a coincidence. But it did happen before to me. So I thought I’d mention it.

• CommentRowNumber66.
• CommentAuthorUrs
• CommentTimeApr 20th 2011

Generally, the lab is very slow this afternoon. That makes it hard to tell whether it’s down or just being lazy.

• CommentRowNumber67.
• CommentAuthorUrs
• CommentTimeApr 20th 2011

I have restarted it again

• CommentRowNumber68.
• CommentAuthorzskoda
• CommentTimeApr 24th 2011

Happy Easter. But lab down. Apr 24, 11:22 CET.

• CommentRowNumber69.
• CommentAuthorTim_Porter
• CommentTimeApr 24th 2011

It has been down since 6.30 BST at least. And Happy Easter to everyone.

• CommentRowNumber70.
• CommentAuthorzskoda
• CommentTimeApr 24th 2011

Well, I do not like common holidays for the plain reason that most things do not function, are closed or are forbidden during them. This time it is the $n$Lab.

• CommentRowNumber71.
• CommentAuthorzskoda
• CommentTimeApr 24th 2011

Well, now there are problems with the $n$Forum as well. It takes several minutes to refresh the preview of a post.

• CommentRowNumber72.
• CommentAuthorTim_Porter
• CommentTimeApr 24th 2011

Some of that may be due to lots of people downloading megalength films somewhere between you and the main ’superhighway’. (After one preview: That was not too bad.)

• CommentRowNumber73.
• CommentAuthorzskoda
• CommentTimeApr 24th 2011
• (edited Apr 24th 2011)

No, there is some other problem. Somehow I cannot get the double dollar sign to work consistently. When it does not like it, like usual

it runs it for minutes and returns at the end the page without the formula.

$A$

(Look at the source of this)

• CommentRowNumber74.
• CommentAuthorTim_Porter
• CommentTimeApr 24th 2011

I have the same problem.

• CommentRowNumber75.
• CommentAuthorAndrew Stacey
• CommentTimeApr 24th 2011

When the lab is down, then the rendering of maths here won’t work either. That’s because the actual conversion takes place on the same server as the nlab due to the fact that it needs something beyond what I’m allowed to run here.

I’ve just restarted the lab. My apologies for the long delay in restarting.

• CommentRowNumber76.
• CommentAuthorTim_Porter
• CommentTimeApr 24th 2011

thanks, Andrew. It is good to know the reason.

• CommentRowNumber77.
• CommentAuthorUrs
• CommentTimeApr 25th 2011

Maybe Tim’s observation above was right after all: as far as I can see the lab was down consistently at the very end of the week, somewhere from Sunday to Monday during the last weeks, maybe longer.

• CommentRowNumber78.
• CommentAuthorTobyBartels
• CommentTimeApr 25th 2011

I just realised that this post (which predates the Forum categories) is in the Atrium; I just moved it to Technical Matters.

• CommentRowNumber79.
• CommentAuthorUrs
• CommentTimeApr 26th 2011

I have restarted the server.

• CommentRowNumber80.
• CommentAuthorAndrew Stacey
• CommentTimeApr 26th 2011

What were the symptoms that time? I’m intrigued because I happened to be logged in as root when you restarted it and so when I noticed that you restarted it, I checked the logs and couldn’t see any of the usual suspects.

• CommentRowNumber81.
• CommentAuthorUrs
• CommentTimeApr 26th 2011

I keep restarting it, but it does not come back at the moment.

• CommentRowNumber82.
• CommentAuthorUrs
• CommentTimeApr 26th 2011

What were the symptoms that time?

Same as always: after calling either show or save nothing happens for a minute or so, and then an error message appears.

• CommentRowNumber83.
• CommentAuthorUrs
• CommentTimeApr 26th 2011

Now it’s back.

• CommentRowNumber84.
• CommentAuthorUrs
• CommentTimeApr 26th 2011

It’s down again. I can’t even restart it, because even the command line is not reacting. (Had this before throughout the morning. It has been immensely slow ever since two hours ago or so.)

• CommentRowNumber85.
• CommentAuthorUrs
• CommentTimeApr 26th 2011

Now I have restarted it again. I have received one response of one page, but am already waiting again for the second page.

By the way, it is this kind of very frustrating experience that made me ask last time: what is our perspective? Do we need to stick with this?

• CommentRowNumber86.
• CommentAuthorAndrew Stacey
• CommentTimeApr 26th 2011

If the commandline is also slow then that’s nothing to do with Instiki but is to do with the connection between your computer and the server.

In poking around just now I discovered that the daily updates weren’t getting applied due to an issue with a dependency. I’ve just fixed that.

• CommentRowNumber87.
• CommentAuthorUrs
• CommentTimeApr 26th 2011
• (edited Apr 26th 2011)

If the commandline is also slow then that’s nothing to do with Instiki but is to do with the connection between your computer and the server.

Sure, but since at the same time my computer happily accesses all kinds of sites, it seems to indicate that the server that the $n$Lab is running on is busy with something else. Might be a hint as to what the problem is, maybe, I thought.

By the way, I have just restarted once more. This time calling any page produced a Passenger error message saying that Ruby on Rails could not be started.

• CommentRowNumber88.
• CommentAuthorAndrew Stacey
• CommentTimeApr 26th 2011

Sure, but since at the same time my computer happily accesses all kinds of sites, it seems to indicate that the server that the nLab is running on is busy with something else.

Not necessarily. Remember that we’re using a VPS on some system, so there are several links in the chain between your computer and the nLab server, any one of which might be causing the slowdown.

• CommentRowNumber89.
• CommentAuthorUrs
• CommentTimeApr 29th 2011
• (edited Apr 29th 2011)

it went really well the last days, but now I had to restart the Lab again.

• CommentRowNumber90.
• CommentAuthorAndrew Stacey
• CommentTimeApr 29th 2011

Maybe the lab is secretly a republican (in the UK sense of the word, not the American).

• CommentRowNumber91.
• CommentAuthorTim_Porter
• CommentTimeApr 30th 2011

It looks as if the Lab has been to a late night street party and has a hang over! It has not yet got up. Can someone turn on the alarm clock and wake it up. (In case non-Brits are unaware of the events in the UK yesterday, there was a royal wedding! As a further completely irrelevant fact, the newly weds live not far from me on Anglesey so there were ‘street parties’… in the streets! At least the weather was good.) End of irrelevant comments… can someone restart the Lab please.

• CommentRowNumber92.
• CommentAuthorUrs
• CommentTimeApr 30th 2011

restarted

• CommentRowNumber93.
• CommentAuthorAndrew Stacey
• CommentTimeMay 1st 2011

Something struck me this morning. One reason why it might go down so often on Sundays might be because that is the day when the system does a full backup. When it does that, it locks the database so that no more information can be written to it. Due to the size of the database, the time that it is locked might be significant. And if instiki tries to access it while it is locked, then that might cause Something Bad.

I’ll investigate further.

• CommentRowNumber94.
• CommentAuthorUrs
• CommentTimeMay 9th 2011

I’ve had to restart the Lab

• CommentRowNumber95.
• CommentAuthorzskoda
• CommentTimeMay 9th 2011

There was quite a time gap between 93 and 94…is it now a little more reliable ?

• CommentRowNumber96.
• CommentAuthorDavidRoberts
• CommentTimeMay 11th 2011

Not sure, but it seems to be down again at the moment.

• CommentRowNumber97.
• CommentAuthorUrs
• CommentTimeMay 11th 2011

I have restarted it.

• CommentRowNumber98.
• CommentAuthorUrs
• CommentTimeMay 11th 2011

It went right down again. I have restarted once more.

• CommentRowNumber99.
• CommentAuthorUrs
• CommentTimeMay 11th 2011

I had to restart the server again (5 min ago).

• CommentRowNumber100.
• CommentAuthorTobyBartels
• CommentTimeMay 29th 2011

I just restarted the server. It required a hard restart.