I’m working on a new Rails project that will have a very, very large
database. The initial import will be in the hundreds of thousands of
records.
The database we’re getting the information from is Microsoft SQL based,
and we have access to tab-delimited update files daily.
We’re trying to write import scripts and build a well-designed database
on our end. The data they’re giving us is horribly designed: there are
dozens of fields in their tables that hold multiple values in a single
column, several tables that are 90% similar to each other and could all
be consolidated into one, and so on.
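For context, the import side is basically just reading those
tab-delimited files line by line and cleaning the data up before it hits
our tables. A stripped-down sketch of the kind of script we’re writing
(the file and column names here are only placeholders, not the real
ones):

require "csv"

# Read one of the daily tab-delimited update files.
CSV.foreach("daily_update.txt", :col_sep => "\t", :headers => true) do |row|
  # Each row still contains the source system's denormalized fields;
  # this is where we'd split and clean them before inserting on our end.
  puts row["ProductID"]
end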
I just have a couple questions about normalization and the way we’re
designing the database.
First, as far as server resources go: would the following normalization
cause higher or lower server load?
The fields we are getting have multiple values stuffed into each field.
For instance, a single field for an option might contain “A, E, F, X, Z”,
which means that five different options are configured for that
selection. We want to have a lookup table, in addition to a table that
explains what each option is, but I’m afraid it will add a lot of
computing time to the queries.
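To make that concrete, here’s roughly what I’m picturing, with made-up
model and column names (this isn’t our actual schema, just a sketch):

# An options lookup table plus a join table, instead of the whole
# "A, E, F, X, Z" string living in one column.
class Option < ActiveRecord::Base
  # columns: code (e.g. "A"), description
  has_many :selection_options
  has_many :selections, :through => :selection_options
end

class Selection < ActiveRecord::Base
  has_many :selection_options
  has_many :options, :through => :selection_options
end

class SelectionOption < ActiveRecord::Base
  # columns: selection_id, option_id
  belongs_to :selection
  belongs_to :option
end

# During import, the comma-separated field for a given selection record
# would be split out into join rows:
codes = "A, E, F, X, Z".split(",").map(&:strip)
selection = Selection.create!
selection.options = Option.where(:code => codes)

My worry is whether those joins end up costing more than just parsing the
string at read time.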
Second, the system is going to have a simple CMS, very similar to what
Shopify uses. We’ll allow the creation of both pages and blogs. It
appears that Shopify uses one database table for the pages and blogs,
and classifies a blog as a certain type of page, one which can have its
own entries (other pages). So all three, blogs, pages, and blog posts,
live in the “pages” table.
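If I’m reading it right, in Rails terms that would be single table
inheritance: one “pages” table with a “type” column, plus something like
a parent_id so blog posts can hang off their blog. The names below are
just my guess at how it might look, not anything Shopify documents:

class Page < ActiveRecord::Base
  # columns: type, title, body, parent_id, ...
end

class Blog < Page
  # a blog is a page whose entries are other pages
  has_many :posts, :class_name => "BlogPost", :foreign_key => "parent_id"
end

class BlogPost < Page
  belongs_to :blog, :class_name => "Blog", :foreign_key => "parent_id"
end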
Is this the way to go? It seems simple and elegant to me, but my
programmer thinks it’s confusing and that at the very least blog entries
should be separated into their own table.
I’m not sure which way is best, so I’m open to advice.