(Replying to PARENT post)
Similarly, using an ORM doesn't help you. All that does is hide the details from you. It still uses SQL underneath, but limits what you can do with it.
Let's talk schemas. You have one whether your database understands it or not. Trying to pretend you don't have one doesn't make managing your schema any easier. It doesn't make building indices on fields any easier.
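A concrete illustration (a Postgres sketch; the table and field names are made up, and the jsonb expression index assumes 9.4+): with a declared schema, indexing a field is one statement, while with "schemaless" blobs the field still exists, you just have to know about it and index the extraction yourself.

    -- With a declared column, indexing is trivial:
    CREATE INDEX idx_users_email ON users (email);

    -- With "schemaless" documents, the field is still there; you
    -- just have to index the extracted expression yourself:
    CREATE INDEX idx_docs_email ON docs ((body ->> 'email'));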
These are hard problems, and pretending they don't exist by rolling your own implementation or hiding them under an ORM won't make them go away.
(Replying to PARENT post)
There's also no reason why you can't use multiple smaller databases. Use dblink. http://www.postgresql.org/docs/9.1/static/contrib-dblink-con...
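For example, a minimal dblink query might look like this (the connection string, table, and column names are illustrative; it assumes the extension is installed via CREATE EXTENSION dblink):

    -- Pull rows from a second, smaller database into the current one.
    SELECT *
    FROM dblink('host=reports-db dbname=reports',
                'SELECT id, total FROM monthly_totals')
         AS t(id integer, total numeric);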
I've never had a problem managing "schema and dependencies" on larger systems. Not sure what you are referring to there.
(Replying to PARENT post)
- Complexity, lock-in: The DBMS is like any other software component: If you're actually using it to its capabilities, then it's trivially obvious that it should be difficult to replace, because there are a lot of capabilities you have to find substitutes for. True, you could avoid being coupled to your dependencies by choosing to reinvent wheels whenever possible instead. That would obviously reduce coupling. But bloating your codebase is generally not the best way to make your code more maintainable.
- schema: Pretty much any programming task is hell for people who aren't skilled at it. If you find that you're not very good at using the standard tools of a carpenter, that does not mean that the tools are fundamentally misdesigned. Much more likely, it means that you should be hiring someone else to do your carpentry for you. It's worth pointing out here that programmers shouldn't need to touch the schema in a well-designed system: They should be touching views and stored procedures that abstract away the implementation details.
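The "views and stored procedures" point is cheap to sketch (all names here are illustrative): the base table stays an implementation detail, and application code only ever sees the view and the function.

    -- Application code queries the view, never the base table:
    CREATE VIEW active_customers AS
        SELECT id, name, email
        FROM customers
        WHERE deleted_at IS NULL;

    -- Writes go through a function that hides the table layout:
    CREATE FUNCTION deactivate_customer(p_id integer) RETURNS void AS $$
        UPDATE customers SET deleted_at = now() WHERE id = p_id;
    $$ LANGUAGE sql;

If the underlying schema changes later, the view and function can be redefined in place without touching application code.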
(Replying to PARENT post)
I view it as an optimization problem. Based upon what you're building, the aforementioned issues are more or less likely to occur. Optimize for what is likely.
If I'm building an official site for a popular product, I'd bet on all the above being true.
Some of what I build, even if 100% of people who could use the product would use it, will run just fine with a single database server that costs in the area of $8k. Planning for scale out is a complete waste of time here.
Most of what I build starts out simple. Most of what I build starts out with 0 users. Will it be popular, will it last, will it need to scale? We have no clue. For these, we keep it simple: modularize our code based upon features, use our datastores to their fullest within features, and keep in mind how we might scale things out (which may ultimately be incorrect) - if we have to. Usually we don't have to. I believe it to be a fair balance in this scenario.
Also, given your constraints, I don't quite understand how you could ever move to a better datastore. How can you exploit better solutions and avoid lock-in at the same time? If a better solution comes along in the future, using its features would lock you to that newer, better datastore. But you cannot do that, because you need to avoid lock-in.
(Replying to PARENT post)
Damn expensive, eh?
As it happens I bought a couple boxes that size just recently. Except with 256G RAM instead of 96G. They cost $8500 per shot. That's not even 2 months of salary for a single developer.
Don't underestimate Moore's Law.
(Replying to PARENT post)
What happens when someone has to do a SQL update manually? Referential integrity can save you from shooting yourself in the foot there. SQL injection? Much less likely if you're using stored procedures with parameters as your client interface.
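Both points are easy to demonstrate (a sketch; the table and function names are made up): a foreign key stops a manual UPDATE or DELETE from orphaning rows, and a function with typed parameters means callers never splice strings into SQL.

    -- Referential integrity: manually deleting a referenced customer
    -- fails instead of silently orphaning its orders.
    CREATE TABLE orders (
        id          serial PRIMARY KEY,
        customer_id integer NOT NULL REFERENCES customers (id)
    );

    -- Parameterized interface: p_customer_id is a typed value, never
    -- concatenated into the query text, so injection has nothing to grab.
    CREATE FUNCTION orders_for_customer(p_customer_id integer)
    RETURNS SETOF orders AS $$
        SELECT * FROM orders WHERE customer_id = p_customer_id;
    $$ LANGUAGE sql STABLE;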
You have lock-in all over the place regardless. Languages. Frameworks. ORMs. Operating systems. All are decisions that are painful to change when something "better" comes along. You're better off making those decisions and then leveraging them for all you can get. RDBMSs have been around longer than many languages and most modern frameworks, and are still a very good, probably the best, general purpose solution for data storage and management.
(Replying to PARENT post)
The DBA you refuse to hire would likely disagree.
(Replying to PARENT post)
Redis solves a certain set of problems for me, Memcached as well. ElasticSearch solves another set, and RDBMS yet another. Square pegs for square holes, round pegs for round holes.
(Replying to PARENT post)
I can understand wanting to avoid Oracle lock-in... But Postgres?
(Replying to PARENT post)
When you have the tools, you have the power.
(Replying to PARENT post)
- scalability. Logic and processing in the server is bad because it means you can only scale up, not out. Scaling up is damn expensive. When you need that 64-core, 96GB machine to run it all on, with 6 replicas, you'll see what I mean.
- complexity. The DBMS is a black box that attracts tight coupling almost instantly. Coupling is bad for maintenance and scalability. SQL always ends up with heavy coupling.
- lock in. Those features only work with Postgres, so you're stuck with it forever or incur a lot of cost to move away when a better solution comes along.
- schema. Maintaining schema and dependencies is hell on all but a small system.
These facts come from 20 years of using databases.
Stuff that will save you: Use your DBMS as a simple flat store akin to a document store (see the sketch below). Use a CQRS-based architecture and encapsulate ALL of your logic in code. If you have to go at least partially relational, for god's sake use an ORM that supports several databases well. Make a provider-independent, app-level backup/restore facility such as the one that ships with JIRA or TeamCity. NEVER hire a DBA - they exist only to create walled gardens that protect their existence.
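A minimal sketch of that flat-store idea (table and column names are illustrative; jsonb assumes Postgres 9.4+, and plain text would keep it engine-neutral):

    -- One opaque-documents table; all interpretation happens in app code.
    CREATE TABLE documents (
        id         uuid PRIMARY KEY,
        doc_type   text NOT NULL,
        body       jsonb NOT NULL,  -- or text, to stay engine-neutral
        updated_at timestamptz NOT NULL DEFAULT now()
    );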
My current gig has cocked up in all of these areas, and it's costing them over 40% of their turnover to keep it alive.
Happy databasing :)