Everyone wants things to look good. Apple taught the tech industry about the value of design, and that you can charge a substantial premium for it. There are even studies that suggest that products can be made more usable simply by looking better, regardless of the actual ease-of-use, since people will spend more time exploring something that looks good.
The thing with analytics platforms is that they don’t fit into this category. No one uses analytics tools recreationally — they exist to solve a problem. And when you need to solve a problem, you care less about what the solution looks like as long as it works.
Yesterday was Tech Talk Thursday and our stupid-talented Graphic Designer, Keegan Berry, brought some intense HTML5 knowledge. He covered a pretty wide range of HTML5 topics, including some sweet background info about W3C and WHATWG that illuminated how HTML5 developed to its current point, but focused mostly on the current state of HTML5 and some fun new semantic developments.
Want a peek at Keegan’s baller-ass training? We’ve compiled a handy TL;DR list for your knowledge-hungry brains.
To coincide with Cloudspace’s release of Rubberband Flamethrower, a benchmarking tool for ElasticSearch, here’s some more helpful info if you’re planning on building search into your app.
ElasticSearch is in the Lucene family, the same as Solr, but the two have different uses. If you’re working on a small app that needs to search fewer than a million documents and the database isn’t updated more than once every ~15 minutes, Solr is the choice, and a lot of projects fit inside these requirements.
That’s not to suggest that Solr is a junior option. Solr is really fast – faster than ElasticSearch. Solr is used in enterprise search. It’s a definite contender.
Coupling is the extent to which any component of your software depends on other components. When software starts to mature, the coupling decisions neglected early on in development can come back with a vengeance and severely slow down what should be a very simple update. Making all aspects of your code as independent as possible will go a long way in keeping technical debt to a minimum and maintainability to a maximum as the code base matures.
Parameters of Methods
The most common coupling-related code smell I see is passing multiple parameters into a method. I prefer to pass in a hash and use the hash keys for my variables in the method.
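Here is a minimal sketch of the difference, assuming a made-up `create_user` method; the method name, keys, and defaults are hypothetical and only there to illustrate the idea. The positional version forces every caller to know the argument order, while the hash version lets callers name what they pass and keeps the defaults in one place.

```ruby
# Hypothetical example: method names, keys, and defaults are illustrative only.

# Positional version: every caller must know the order of four arguments.
def create_user_positional(name, email, role, active)
  { name: name, email: email, role: role, active: active }
end

# Hash version: callers name what they pass, and defaults live in one place.
def create_user(options = {})
  defaults = { role: "member", active: true }
  settings = defaults.merge(options)

  {
    name:   settings.fetch(:name),  # fetch raises KeyError if a required key is missing
    email:  settings.fetch(:email),
    role:   settings[:role],
    active: settings[:active]
  }
end

# Key order no longer matters, and optional values can be left out.
user = create_user(email: "ada@example.com", name: "Ada")
```

Adding a new optional key later means touching the defaults and the method body, not every call site.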
Congratulations to our friends over at Mashery: Scott Rafer, Kirsten Spoljaric, Oren Michels, and all the rest of the team. They’re the world’s best API management company, with clients like Coca-Cola, Best Buy, and Intel. Their customers love them so much that this week, Intel actually decided to purchase the whole company! We’ve been friends with them ever since they started operating in 2006, and this acquisition is even more confirmation of how great they are!
Both Mashery and Intel are understandably being quiet about the purchase price, but ReadWrite and TechCrunch are stating it’s in the range of $180 million! Regardless of the exact amount, our friends at Mashery are sure to be dancing in the streets.
Rubberband Flamethrower is a rubygem for benchmarking data insertion on Elastic Search servers that I’m proud to release into the open source wilds. If you want to mess around with big data, Elastic Search is definitely a cool tool to play with. First let’s talk a little bit about what Elastic Search is and how to get it set up.
Elastic Search is an open source search and analytics engine built on top of Apache Lucene. It stores data as JSON documents and is managed through a RESTful API. It is designed to be used at high scale and to respond quickly to data queries that would take forever on a traditional database such as MySQL. Elastic Search is designed to grow with your data, and its clusters automatically reorganize to take advantage of new hardware. It is very easy to get a basic Elastic Search node up and running.
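To make "JSON documents managed through a RESTful API" a bit more concrete, here is a rough sketch that builds (but does not send) an HTTP request to index one document. It assumes a node listening on the default port 9200 on localhost, a hypothetical `tweets` index, and the `/<index>/_doc/<id>` path used by recent releases; older releases put a type name in that position, so check the docs for your version.

```ruby
require "json"
require "net/http"
require "uri"

# Assumptions: the index name, document fields, and helper name are made up
# for illustration; the URL path follows the /<index>/_doc/<id> convention.
def build_index_request(base_url, index, id, document)
  uri = URI.join(base_url, "/#{index}/_doc/#{id}")
  request = Net::HTTP::Put.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate(document)   # the document travels as plain JSON
  [uri, request]
end

document = { user: "cloudspace", message: "Benchmarking inserts", likes: 3 }
uri, request = build_index_request("http://localhost:9200", "tweets", 1, document)

# Sending is left commented out so the sketch runs without a server:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.body
```

Uncommenting the last two lines against a running node should store the document under ID 1 in the `tweets` index.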
This is the first in a set of blog posts about how we use Git and Capistrano in our project workflow at Cloudspace. Everyone can agree that source control and automated deployment are important for any web project. We work in small teams within Cloudspace or alongside our clients’ engineers, so it’s even more important to have a consistent strategy for maintaining branches and handling deployment to keep everything working smoothly. We generally choose Git for source control and Capistrano for deployment at Cloudspace. We’ve been known to use other tools when a project requires it, but if we are starting a new project from scratch we’ll choose Git and Capistrano 9 times out of 10. Git has some great branching features that give it a real leg up on SVN, and the hosted Git service offered by GitHub makes it easy to work with other coders the world over. This first post will concentrate on Git and the process leading up to a deploy.
Business is moving faster than it used to. This is a cliche, and you’re probably rolling your eyes that I started a blog post with that line. But it’s important to keep in mind, because of what I’m about to tell you.
Because things do move faster than they used to, any opportunity to be faster than the competition is seen as a good thing. Real-time is what everyone aspires to, with good reason, and in a sense even faster-than-real-time is coming.
Knowing what’s going on right now means you can make better decisions, and that’s the whole point of an analytics platform.
One of the biggest reasons why people build analytics platforms is that they have data from different sources, but they don’t want to have to jump between different tools. There are so many SaaS companies now that even small businesses have many data sources:
- Google Analytics
- mobile app analytics
- FB Ads
- Twitter Ads
- newsletter data
- new user signups
- A/B testing
Jumping between them all is a real problem. It’s a much better idea to have all the data in a single place.
Here’s the part where a good idea goes too far. Let’s say an engineering team can integrate two new data sources every week. If you’ve got 20 sources, you’re pushing back your launch by more than 2 months!
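The arithmetic behind that estimate is simple enough to sketch. The helper name below is made up for illustration, and it assumes integration work is the only thing holding up the launch.

```ruby
# Weeks of integration work, rounding up any partial week.
# (Hypothetical helper; assumes a steady pace of sources per week.)
def integration_weeks(source_count, sources_per_week)
  (source_count.to_f / sources_per_week).ceil
end

weeks = integration_weeks(20, 2)
puts "#{weeks} weeks"  # prints "10 weeks"
```

Ten weeks is a little over two months, and that is before anyone builds a single report.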
There are two reasons why people build analytics platforms:
- to collect data that they can’t get any other way
- to run reports on data that’s been collected
When you’re thinking about the types of reports you can run on data, it seems that there are an endless number of ways to slice and dice.
The problem is that you should never ask an engineer to build something so that you can query the data “any way we want”. The result is always going to be messy and take a long time to build.