(Replying to PARENT post)
Similarly, using an ORM doesn't help you. All that does is hide the details from you. It still uses SQL underneath, but limits what you can do with it.
Let's talk schemas. You have one whether your database understands it or not. Trying to pretend you don't have one doesn't make managing your schema any easier. It doesn't make building indices on fields any easier.
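A concrete illustration (a Postgres sketch; the table and field names are made up, and the jsonb expression index assumes 9.4+): with a declared schema, indexing a field is one statement, while with "schemaless" blobs the field still exists, you just have to know about it and index the extraction yourself.

    -- With a declared column, indexing is trivial:
    CREATE INDEX idx_users_email ON users (email);

    -- With "schemaless" documents, the field is still there; you
    -- just have to index the extracted expression yourself:
    CREATE INDEX idx_docs_email ON docs ((body ->> 'email'));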
These are hard problems, and pretending they don't exist by rolling your own implementation or hiding them under an ORM won't make them go away.
(Replying to PARENT post)
There's also no reason why you can't use multiple smaller databases. Use dblink. http://www.postgresql.org/docs/9.1/static/contrib-dblink-con...
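For example, a minimal dblink query might look like this (the connection string, table, and column names are illustrative; it assumes the extension is installed via CREATE EXTENSION dblink):

    -- Pull rows from a second, smaller database into the current one.
    SELECT *
    FROM dblink('host=reports-db dbname=reports',
                'SELECT id, total FROM monthly_totals')
         AS t(id integer, total numeric);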
I've never had a problem managing "schema and dependencies" on larger systems. Not sure what you are referring to there.
(Replying to PARENT post)
- Complexity, lock-in: The DBMS is like any other software component: If you're actually using it to its capabilities, then it's trivially obvious that it should be difficult to replace, because there are a lot of capabilities you have to find substitutes for. True, you could avoid being coupled to your dependencies by choosing to reinvent wheels whenever possible instead. That would obviously reduce coupling. But bloating your codebase is generally not the best way to make your code more maintainable.
- schema: Pretty much any programming task is hell for people who aren't skilled at it. If you find that you're not very good at using the standard tools of a carpenter, that does not mean that the tools are fundamentally misdesigned. Much more likely, it means that you should be hiring someone else to do your carpentry for you. It's worth pointing out here that programmers shouldn't need to touch the schema in a well-designed system: They should be touching views and stored procedures that abstract away the implementation details.
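The "views and stored procedures" point is cheap to sketch (all names here are illustrative): the base table stays an implementation detail, and application code only ever sees the view and the function.

    -- Application code queries the view, never the base table:
    CREATE VIEW active_customers AS
        SELECT id, name, email
        FROM customers
        WHERE deleted_at IS NULL;

    -- Writes go through a function that hides the table layout:
    CREATE FUNCTION deactivate_customer(p_id integer) RETURNS void AS $$
        UPDATE customers SET deleted_at = now() WHERE id = p_id;
    $$ LANGUAGE sql;

If the underlying schema changes later, the view and function can be redefined in place without touching application code.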
(Replying to PARENT post)
I view it as an optimization problem. Based upon what you're building, the aforementioned issues are more or less likely to occur. Optimize for what is likely.
If I'm building an official site for a popular product, I'd bet on all the above being true.
Some of what I build, even if 100% of people who could use the product would use it, will run just fine with a single database server that costs in the area of $8k. Planning for scale out is a complete waste of time here.
Most of what I build starts out simple. Most of what I build starts out with 0 users. Will it be popular, will it last, will it need to scale? We have no clue. For these, we keep it simple: modularize our code based upon features, use our datastores to their fullest within features, and keep in mind how we might scale things out (which may ultimately be incorrect) - if we have to. Usually we don't have to. I believe it to be a fair balance in this scenario.
Also, given your constraints, I don't quite understand how you could ever move to a better datastore. How can you exploit better solutions and avoid lock-in at the same time? If a better solution comes along in the future, using its features would lock you to that newer, better datastore. But you cannot do that, because you need to avoid lock-in.
(Replying to PARENT post)
Damn expensive, eh?
As it happens I bought a couple boxes that size just recently. Except with 256G RAM instead of 96G. They cost $8500 per shot. That's not even 2 months of salary for a single developer.
Don't underestimate Moore's Law.
(Replying to PARENT post)
What happens when someone has to do a SQL update manually? Referential integrity can save you from shooting yourself in the foot there. SQL injection? Much less likely if you're using stored procedures with parameters as your client interface.
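Both points are easy to demonstrate (a sketch; the table and function names are made up): a foreign key stops a manual UPDATE or DELETE from orphaning rows, and a function with typed parameters means callers never splice strings into SQL.

    -- Referential integrity: manually deleting a referenced customer
    -- fails instead of silently orphaning its orders.
    CREATE TABLE orders (
        id          serial PRIMARY KEY,
        customer_id integer NOT NULL REFERENCES customers (id)
    );

    -- Parameterized interface: p_customer_id is a typed value, never
    -- concatenated into the query text, so injection has nothing to grab.
    CREATE FUNCTION orders_for_customer(p_customer_id integer)
    RETURNS SETOF orders AS $$
        SELECT * FROM orders WHERE customer_id = p_customer_id;
    $$ LANGUAGE sql STABLE;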
You have lock-in all over the place regardless. Languages. Frameworks. ORMs. Operating systems. All are decisions that are painful to change when something "better" comes along. You're better off making those decisions and then leveraging them for all you can get. RDBMSs have been around longer than many languages and most modern frameworks, and are still a very good, probably the best, general purpose solution for data storage and management.
(Replying to PARENT post)
The DBA you refuse to hire would likely disagree.
(Replying to PARENT post)
Redis solves a certain set of problems for me, Memcached as well. ElasticSearch solves another set, and RDBMS yet another. Square pegs for square holes, round pegs for round holes.
(Replying to PARENT post)
I can understand wanting to avoid Oracle lock-in... But Postgres?
(Replying to PARENT post)
When you have the tools, you have the power.
(Replying to PARENT post)
- scalability. Logic and processing in the server is bad because it means you can only scale up, not out. Scaling up is damn expensive. When you need that 64-core, 96GB machine to run it all on, with 6 replicas, you'll see what I mean.
- complexity. The DBMS is a black box that attracts tight coupling almost instantly. Coupling is bad for maintenance and scalability. SQL always ends up with heavy coupling.
- lock in. Those features only work with Postgres, so you're stuck with it forever or incur a lot of cost to move away when a better solution comes along.
- schema. Maintaining schema and dependencies is hell on all but a small system.
These facts come from 20 years of using databases.
Stuff that will save you: Use your DBMS as a simple flat store akin to a document store (see the sketch below). Use a CQRS-based architecture and encapsulate ALL of your logic in code. If you have to go at least partially relational, for god's sake use an ORM that supports several databases well. Make a provider-independent, app-level backup/restore facility such as the one that ships with JIRA or TeamCity. NEVER hire a DBA - they exist only to create walled gardens that protect their existence.
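A minimal sketch of that flat-store idea (table and column names are illustrative; jsonb assumes Postgres 9.4+, and plain text would keep it engine-neutral):

    -- One opaque-documents table; all interpretation happens in app code.
    CREATE TABLE documents (
        id         uuid PRIMARY KEY,
        doc_type   text NOT NULL,
        body       jsonb NOT NULL,  -- or text, to stay engine-neutral
        updated_at timestamptz NOT NULL DEFAULT now()
    );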
My current gig has cocked up in all of these areas, and it's costing them over 40% of their turnover to keep it alive.
Happy databasing :)