Recent tech issues
My name's Dick, and I'm the project manager for TrekEarth. I'm not new here. In fact, I've been working on TrekEarth since March of 2009-- before the redesign, when Adam was handling all of the tech for the site. I worked with Adam on the relaunch, the migration to the new forum, and just about everything since.
Before I address the recent tech problems, I'd like to address my absence on the forums. Back when Adam was around, my presence wasn't really needed, as the community knew Adam and he did a great job communicating with you. When Adam left, Steph did an amazing job of stepping in and engaging the community. We decided that our time was better spent with me managing the tech side of the site, and her communicating the issues to me.
While Amber continues to do an amazing job of being a conduit between me and the community, I believe the recent tech issues warrant a response directly from the tech team, hence this post.
When the site was migrated to our hosting (a long time ago), it was set up using an architecture mirroring the old host. While this architecture isn't terrible, it has some limitations and is not our standard configuration. We have had a plan to move this to a more modern configuration for some time, but have felt our efforts were better spent fixing bugs. Over the last year, we've made preparations, but stayed short of an actual migration to new hardware and updated software.
Because of this outdated and unfamiliar environment, we've had considerable difficulty working with the code and been limited in our ability to mitigate problems regarding scalability. Many of the bugs that have appeared over the last year have been due to these issues. It seems like every time we fix something, two other things break-- two more things that are, once again, difficult to track down because of the current technology.
One specific problem that we were having was with the database. It ran an outdated version of MySQL that our developers are less familiar with. Many of our performance enhancements don't work on that version, so when the site becomes slow, we aren't able to address the issues as quickly as we'd like.
As to the most recent database errors, I'm sorry for the problems. There was a poorly written query that had been running on the site for a long, long time. It had always been a problem, but last weekend it came to a head: it finally hit the breaking point and started crashing the site. With the outdated version of MySQL, we couldn't use our standard diagnostic tools to track it down. Since the site was constantly crashing, I made the decision to migrate the database to a new server with updated software.
While this is usually not a monumental task, this specific migration posed significant challenges. First, since the site was crashing, we had to make the migration on short notice. This meant that we didn't have the time to do our usual preparations and planning. Furthermore, we didn't have time to do a test upgrade to look for errors. We were forced to do all of this on the fly, and as with any time you operate in this manner, problems followed. Some of these problems are still being actively addressed by me and my developers.
One of the oversights was the beta side of the site. When we switched the www side of the site over to the new database, we didn't realize that the database server's IP was specified in a different manner on the beta side of the site. This means that, for a period, the beta side of the site was using the old database. When this was corrected, all of the content that was added to the site through the beta interface was lost. I apologize for this. Furthermore, this afternoon we realized that the beta version of the forum was still using the old database. Again, when we corrected this, we lost all content that had been added through the beta side of the forum.
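For the technically curious, the kind of check that could catch this sort of mismatch might look roughly like the sketch below. It is only an illustration under assumptions: the front-end names, setting keys, and addresses are hypothetical and are not taken from the actual TrekEarth configuration. The idea is to read the database host from every front end, however each one spells it, and flag any that still point at the old server.

```python
from urllib.parse import urlparse

# Hypothetical settings for each front end; none of these names or
# addresses come from the real TrekEarth configuration.
FRONTENDS = {
    "www": {"db_host": "10.0.0.20"},
    "beta": {"db_dsn": "mysql://app@10.0.0.10:3306/site"},
    "beta_forum": {"db_dsn": "mysql://app@10.0.0.10:3306/forum"},
}


def database_host(settings):
    """Return the database host, whichever way the settings spell it."""
    if "db_host" in settings:
        return settings["db_host"]
    if "db_dsn" in settings:
        # urlparse extracts the host part of a scheme://user@host:port/db string.
        return urlparse(settings["db_dsn"]).hostname
    raise KeyError("no database host configured")


def stale_frontends(frontends, expected_host):
    """List the front ends whose database host differs from expected_host."""
    return sorted(
        name
        for name, settings in frontends.items()
        if database_host(settings) != expected_host
    )


if __name__ == "__main__":
    # With the sample data above, the two beta entries are reported.
    print(stale_frontends(FRONTENDS, "10.0.0.20"))
```

Run before and after a cutover, a check like this would list the front ends that had not yet been switched to the new database.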
It is because of this added complexity that we never intended on maintaining the beta version of the site. Still, it was absolutely not our intention for these events to happen. We did not purposefully neglect the beta side of the site. We were simply reacting the best we could to a bad situation.
The good news is that with the exception of a few lingering issues, we are now using a more modern version of the database software. This has not only allowed us to fix the problematic query, it has allowed us to fix other problematic queries and perform some much needed optimizations. From the database side of things, we're in a much more stable condition.
From the web server side of things, we're still in a less than optimal situation. We don't have the ability with the current configuration to scale horizontally, which is what we need to do. The recent problems have demonstrated that we need to increase the prioritization of this, and this time we will have the time to properly plan this out.
Over the next week, we will be making a number of changes to the site's application servers. We will be adding more power and redundancy to the system. During this period, we might have some outages. I intend to give fair warning here on the forums when we expect these outages, and we will do our best to minimize them. We understand that it's been a rough ride lately, and we don't want to make it any worse. We are working to make things better.
I would like to apologize for the inconveniences that the recent tech issues have caused, and we appreciate your patience while we continue to work through them.
Project Manager -- Internet Brands
I'm getting an "Invalid multi photo query" message when clicking on subsections like east or west within country sections. Is this related to the larger tech issues?
Thanks for the detailed update Dick. I suspect much of the frustration felt by the membership was due to the lack of any kind of feedback / information on what was going on. Your post, I believe, will go a long way to addressing those frustrations and concerns.
I'd better add that I'm posting this purely as a long term and fairly active member and not looking for brownie points in my mod role. :)
There is a problem this morning: at 0730 Montreal time, I cannot upload a photo. It is saying that I already uploaded a photo for today.
I uploaded a photo on November 09, not on November 10.
It seems the site has not updated itself since yesterday.
I have the same issue - I uploaded my last photo yesterday morning and if I try to upload one now it tells me that "you have reached your limit for today".
Besides, TE was down for hours this morning, until 30 minutes ago. Are you working on the server? Maybe something went wrong?
I would appreciate a notice informing us of a server outage - I didn't see any yesterday...
I cannot upload a photo either. Tells me I have to wait 20 hours now (10:55 EST)
Photos on the web site?
Just wondering: I used to be able to click on any picture on this website to see it in a bigger format. Unfortunately, I do not know why, but today I could not.
Thank you and have a good day!
I am unable to reproduce this problem. When I click on the first photo in the gallery I am taken to the correct page with the full sized image. I tested in IE, FF and Chrome: http://www.trekearth.com/gallery/Eur...oto1259829.htm
What images are not coming up for you? Is there an error when you click on it? Please provide details.
During the last 2-3 days, when I click on a thumbnail, I cannot open up the photo.
We didn't use to encounter such problems in the old version of Trekearth. It was very, very functional.
More Techy Problems
Looks like you poor guys are being inundated.
For a number of days, I could not open any photos, but that problem seems to have ceased.
But there is one exception:
One of my own photos cannot be opened at all by me or anyone else:
Could you please let me know if this is temporary or likely to be a permanent bug?