Reallylinux.com   Website for Beginning Linux Users
Read in 133 countries by over 600,000 users.




Catalyst scales LAMP and Drupal to meet enterprise demands
An Interview with Mike O'Connor - Catalyst IT New Zealand


This interview with Catalyst's Director Mike O'Connor sheds light on how enterprise scaling of LAMP and the use of CMS platforms including Drupal and Moodle can dramatically benefit companies.

Catalyst IT is a New Zealand-based company that began in 1997, when open source technologies, applications and awareness were still in their infancy. The company chose open source as its key technology stack because it gave easy access to a wealth of technologies.



     

OPEN SOURCE AS AN ENTERPRISE SOLUTION

REALLYLINUX: Catalyst has been implementing solutions around open source technologies probably longer than almost any other IT company in all of New Zealand. Can you share some of the highlights and the most difficult challenges?

Catalyst's Mike O'Connor: Any services-based business goes through good and bad times, and we've been through enough challenging periods in our history to appreciate what we have achieved. By staying committed to our open source values, we have been able to differentiate ourselves in the market, and attract dedicated and highly skilled geeks who share and maintain our passion for open source technologies and the values that surround them.

To spread our message, we have been and always will be very active around issues that affect the freedoms offered by open source.

We look around sometimes and reflect on the fact that Catalyst operates NZ's Electoral Roll, supplies the systems for our national general election, operates the .nz domain name registry, operates our national wagering site, and services multiple online media services as well as a global list of Moodle, Totara, Drupal, Mahara and Koha clients.

It's only been possible through our choice of open source, and the flow-on effects of that decision.



DATABASE SCALING

REALLYLINUX: Mike, can you perhaps speak to the specific critical facets you see for effectively scaling the mid-tier (database) for web sites?

Mike O'Connor: While Postgres is our typical choice for a database back-end component, we also use MySQL for some Drupal deployments.

However, with respect to scaling a database layer, in both cases the task comes down to a careful choice of appropriate hardware (amount of memory, CPU cores, IO subsystem type and layout), followed by tuning various parameters for the database product concerned. Both MySQL and Postgres scale very well with the usual number of CPU cores available in mainstream hardware.

A limited degree of scale-out is possible by means of replication to one or more additional nodes, but currently neither of the two database products above has a true "cloudy" configuration (multi-master sharded, or single-master self-managing sharded) that we can generally use. Note that our applications are typically not certified with MySQL Cluster.
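
As a rough illustration of the replication-based scale-out mentioned above, the sketch below sends writes to a single primary and spreads reads across replicas. It is a minimal Python sketch under stated assumptions, not Catalyst's implementation: the class name `ReplicatedRouter`, the node names and the SELECT-prefix test are all hypothetical, and a real deployment would also have to account for replication lag.

```python
import itertools


class ReplicatedRouter:
    """Route writes to one primary and rotate reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def node_for(self, sql):
        # Naive classification (an assumption for this sketch):
        # only statements starting with SELECT are treated as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_cycle) if is_read else self.primary


# Hypothetical node names, for illustration only.
router = ReplicatedRouter("db-primary", ["db-replica-1", "db-replica-2"])
print(router.node_for("SELECT * FROM node"))
print(router.node_for("UPDATE node SET status = 1"))
```

Because every write still lands on one node, this pattern only scales read traffic, which is why it offers a limited degree of scale-out.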

REALLYLINUX: There are many companies facing the prospect of expanding their mid-tier due to scaling and growth. Can you provide a glimpse into the infrastructure that runs large database-driven websites like stuff.co.nz? Servers, VMs, how do you manage load balancing or issues around scaling a "multi-site" Drupal environment?

Mike O'Connor: Catalyst typically deploys high-traffic sites on a multiple-tier architecture, using free software throughout.

The application server tier consists of a number of servers, running whatever software stack is appropriate for the site in question. These application servers are load balanced using an IPVS (http://kb.linuxvirtualserver.org/wiki/IPVS) instance, generally configured on the external firewall.

The back-end database servers will run either MySQL or PostgreSQL, depending on the individual project's requirements. The back-end tier may also provide additional services such as Memcached or Apache Solr.

Finally, in some situations we may deploy a caching tier in front of the application servers. Originally we would use Squid for this caching tier; however, for recent deployments we have used Varnish instead, as we find it is far more flexible.
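
To make the load-balanced application tier a little more concrete, here is a minimal Python sketch of weighted least-connection selection, one of the scheduling styles a load balancer such as IPVS can apply. It only illustrates the idea and is not IPVS code; the `Backend` class, the `pick_backend` function and the server names are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    weight: int = 1   # relative capacity of the server
    active: int = 0   # connections currently assigned to it


def pick_backend(backends):
    """Pick the backend with the lowest active-connections-to-weight ratio."""
    candidates = [b for b in backends if b.weight > 0]
    if not candidates:
        raise RuntimeError("no backend available")
    chosen = min(candidates, key=lambda b: b.active / b.weight)
    chosen.active += 1  # record the newly assigned connection
    return chosen


# Hypothetical pool: app1 is assumed to have twice the capacity of app2.
pool = [Backend("app1", weight=2), Backend("app2", weight=1)]
print([pick_backend(pool).name for _ in range(3)])  # ['app1', 'app2', 'app1']
```

A weight of zero takes a server out of rotation in this sketch, which is a convenient way to model draining a node before maintenance.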



SCALING DRUPAL

REALLYLINUX: Catalyst has developed renowned expertise around Drupal. Specifically, can you share some of the keys to ensuring that Drupal sites like ODT.co.nz and SCMP not only handle peak loads now, but can also scale to future needs and adapt to mobile and other platforms? As a follow-up, do you see any limitations with Drupal in this space?

Mike O'Connor: Drupal's design prioritizes flexibility over performance.

While that flexibility makes Drupal capable of many things, high performance often isn't something that comes out of the box. To improve site performance with Drupal 7, for instance, you can apply some of these techniques:

  • By default, Drupal uses its database as a cache store. Under MySQL (or MariaDB), this is just as fast to read and write to as using Memcache. However, when your database is under high load, you don't want that to affect the performance of your cache, so separating the cache out is a good idea. For the same reason, whatever you use as your cache handler, don't host it on your database server.

  • Using Drupal's Panels module for page layout and disabling Drupal blocks is a good idea. Panels allows you to set better caching rules around your content than Drupal core does.

  • Use caching in the Views module. Views can spend a lot of time building a SQL query and even more time rendering it. Caching at least the query construction can save you a lot of time, especially when a page uses a lot of Views.

  • Page-level caching is a huge performance win. The Boost module works well for medium-sized sites: it pushes Drupal-generated content to the filesystem, where your web server (Apache2 is supported by default) attempts to serve it before falling back to Drupal. The Varnish HTTP accelerator can do a better job, smartly handling many connections and serving a cache specific to each client type by obeying HTTP standards that are now also supported in Drupal 7.

Finally, while Drupal can do a lot by itself, you can do more by utilizing your entire technology stack. For example, because Drupal is written in PHP, it has to compile to opcode and bootstrap on each request, which is a lot of overhead if your task could instead be handled by smarter web server configuration, page caching or local browser caching.
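
The page-level caching idea in this answer (serve a stored copy when one exists, and only fall back to the application when it does not) can be sketched in a few lines of Python. This is a simplified illustration, not how Boost or Varnish is implemented; the `PageCache` class, its `ttl_seconds` parameter and the `render` callback are hypothetical names chosen for the example.

```python
import time


class PageCache:
    """Serve cached pages while fresh, and re-render once they expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock     # injectable so expiry can be tested
        self._store = {}       # path -> (expires_at, html)

    def get_or_render(self, path, render):
        now = self.clock()
        entry = self._store.get(path)
        if entry and entry[0] > now:
            return entry[1]    # cache hit: the application is never called
        html = render(path)    # cache miss: take the expensive path
        self._store[path] = (now + self.ttl, html)
        return html


def slow_render(path):
    # Stand-in for a full application bootstrap and page build.
    return "<html>" + path + "</html>"


cache = PageCache(ttl_seconds=60)
cache.get_or_render("/news", slow_render)  # rendered and stored
cache.get_or_render("/news", slow_render)  # served from the cache
```

The time-to-live is the main design trade-off here: a longer value saves more rendering work but lets visitors see stale pages for longer.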



DRUPAL LESSONS LEARNED

REALLYLINUX: What is your biggest lesson learned with regard to use and development in the Drupal context? You've also done a lot of work in the Moodle space (with your Totara). Can you briefly share what determined your choice in this regard?

Mike O'Connor: We chose Moodle and Drupal due to their feature sets, and their fit to the technology stacks we were experts with.

Both Moodle and Drupal are large code-bases and application architectures and we accept their architectural and performance designs, good and bad, and just work with them.

They're both fantastic examples of the power of the open source paradigm, with worldwide usage that far exceeds that of all their proprietary competitors. Our focus is on using them and improving them: we have staff associated with each project, and we insist that any improvements or enhancements we make are sent upstream to keep them evolving.



LAMP STACK SCALING

REALLYLINUX: Finally, can you provide some insight into how you scale the LAMP stack (in your case Postgres) and issues you've found when dealing with peak volume or fluctuating peaks on infrastructure? How do you manage these variances, or deal with excess bandwidth vs. peak loads?

Mike O'Connor: The biggest lesson learned in delivering great performance on large-scale systems (200,000+ users) has been testing our assumptions: finding the bottlenecks and then fixing them. On very large systems, bottlenecks can be in strange places, such as IO buses on the servers, networking, or database queries that perform fine on most systems until you put 2 million users on them.

Drupal, Moodle and Totara are all open source web applications and run on very similar technologies; therefore, as we learn new techniques to optimise a Drupal system, the lessons are often equally applicable to Moodle and Totara. Getting great large-scale performance out of these systems requires understanding the underlying technologies so you can architect an optimum system that delivers the web application's performance requirements.

We have had several instances where customers have come to us with poor performance because their systems were installed on virtual servers or cloud infrastructure without taking into consideration important bottlenecks such as disk IO and CPU contention. Cloud and virtual infrastructure are great solutions; however, you need to understand the limitations of the infrastructure and networks you are using if you are to guarantee your performance requirements.

Scaling systems such as Moodle, Totara and Drupal requires building from the base up: understanding your performance requirements and infrastructure, then testing your assumptions and implementing solutions to overcome the unknown bottlenecks.
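
One simple way to act on the "test your assumptions" advice in this answer is to time each stage of a request and rank the stages by cost, rather than guessing where the bottleneck is. The Python sketch below shows the idea only; the `StageTimer` class and the stage names are hypothetical, and real investigations of disk IO or CPU contention would rely on system-level tooling as well.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named stage of a request."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.totals = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = self.clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raised an error.
            self.totals[name] += self.clock() - start

    def slowest(self):
        """Return (stage, seconds) pairs, most expensive stage first."""
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


timer = StageTimer()
with timer.stage("database"):
    time.sleep(0.02)   # stand-in for a slow query
with timer.stage("render"):
    time.sleep(0.005)  # stand-in for template rendering
print(timer.slowest()[0][0])
```

Measurements like these are only meaningful under realistic load, which matches the point above that some queries perform fine until the user count grows.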

REALLYLINUX: Thank you for taking the time to share these details, as we know that your experiences and expertise in Open Source deployment and scaling will benefit many in the community.



For further information regarding the technologies and companies discussed in this interview please see:

http://catalyst.net.nz/

http://drupal.org/

http://www.totaralms.com/

http://www.postgresql.org/

http://moodle.org/



This interview is provided courtesy of Catalyst IT Ltd., a New Zealand company, and is published by permission, 2012.






© 2014 Mark Rais & Reallylinux.com. All rights reserved internationally.


Linux is a registered trademark of Linus Torvalds.
All other trademarks and registered trademarks on this entire web site are owned by their respective companies.
This site is not related to or affiliated with any other websites.