tag:blogger.com,1999:blog-3346090548501966296.post889116800234237752..comments2024-03-26T05:03:04.875-07:00Comments on Perspectives on LedgerSMB: Math and SQL, part 2: Functions and First Normal FormChris Travershttp://www.blogger.com/profile/06211762965865744803noreply@blogger.comBlogger13125tag:blogger.com,1999:blog-3346090548501966296.post-4282481653466342212014-08-14T23:26:30.441-07:002014-08-14T23:26:30.441-07:00Sorry, when I read your comment about the triangle...Sorry, when I read your comment about the triangle, I had falsely read it as if you were treating triangles differently than other types like strings or numbers. But then I re-read your post eg "Thus the whole tuple can be said to be a candidate key, and every member a trivial function of it." and so yes in that sense you are treating the triangle the same as the other types.Darren Duncanhttps://www.blogger.com/profile/14457831481137852460noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-74407685201767683372014-08-12T00:05:07.560-07:002014-08-12T00:05:07.560-07:00Ok, I am having issues posting this (third time is...Ok, I am having issues posting this (third time is a charm?)...<br /><br />Your triangle is at least a function of the tuple. If not, then you have broken 1NF. That is a trivial but important functional dependency. It could also be dependent on a secondary key, but it will be functionally dependent on the tuple as a whole. The each triangle's vertex, however, is *not* functionally dependent on the tuple as a whole and so seeing this as a collection of points instead of a triangle breaks 1NF.Chris Travershttps://www.blogger.com/profile/06211762965865744803noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-91517095639991501932014-08-11T02:39:48.506-07:002014-08-11T02:39:48.506-07:00Why does the triangle have to be functionally depe...Why does the triangle have to be functionally dependent on anything else just because its a triangle? If anything, my use case example has it that the triangle valued attribute IS the primary key of the relation, as in that use case the relation is a set of triangles with information about each distinct triangle, so the other tuple attributes are dependent on the triangle, not the reverse, in my example.Darren Duncanhttps://www.blogger.com/profile/14457831481137852460noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-3502716209874862222014-08-11T02:38:21.208-07:002014-08-11T02:38:21.208-07:00This comment has been removed by the author.Darren Duncanhttps://www.blogger.com/profile/14457831481137852460noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-45525012137211238992014-08-10T20:56:47.954-07:002014-08-10T20:56:47.954-07:00As I think about it, regarding set-represented tri...As I think about it, regarding set-represented triangles and 1NF, the interesting thing is that the triangle, represented as a set or not, is still functionally dependent on the tuple or candidate key. The tuple has a definite number of elements (3) and all possible orders are equivalent.<br /><br />This gets to your point of where the set is a complete representation of a single value (which happens to be represented by a finite set). This is where type encapsulation in PostgreSQL becomes really handy.Chris Travershttps://www.blogger.com/profile/06211762965865744803noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-44451258189184603732014-08-10T18:03:14.418-07:002014-08-10T18:03:14.418-07:00Yes, I agree with everything as you say here, and ...Yes, I agree with everything as you say here, and that limiting 1NF to "scalars" is illogical. As to normalizing polygons, I can think of a simple way, which is to use distance from the origin [0,0,...]; say the first point is the one closest to or on the origin, and then out of the 2 other points it connects to, the one second-closest to the origin is considered the second point, and then we go in that direction. I speak of normalization in terms of ordering a structure so that one can simply compare 2 values for equality/identity by naive element-wise comparison. Likewise, Unicode normal forms or the like do that for text, for some sense of equal (eg, canonical decomposed). By putting more logic into normalizing value representation, equality tests between 2 values can be made trivial.Darren Duncanhttps://www.blogger.com/profile/14457831481137852460noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-87373929763266743272014-08-10T17:44:12.416-07:002014-08-10T17:44:12.416-07:00This conversation is really interesting. It confi...This conversation is really interesting. It confirms my sense that 1NF is one of those topics that people gloss over but it deserves a lot more coverage.Chris Travershttps://www.blogger.com/profile/06211762965865744803noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-32196893850475171782014-08-10T17:41:37.809-07:002014-08-10T17:41:37.809-07:00Think about the danger of "only scalar values...Think about the danger of "only scalar values are allowed." Where do we stop? Are datetime fields a 1NF violation (since a datetime can be thought of as a (date,time) couple as well as a scalar offset from a specific point in time? Can you store Heisenberg Matrices in a relational database? What about Dirac Vectors? It would be truly ironic if relational databases, based on math, couldn't be used in math and physics.<br /><br />Polygons are interesting here. Ideally one would normalize not only for direction but also starting point, but that might be difficult. These are all cases where PostgreSQL's type encapsulation capabilities are really helpful.Chris Travershttps://www.blogger.com/profile/06211762965865744803noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-7736882138419979092014-08-10T17:14:21.153-07:002014-08-10T17:14:21.153-07:00Chris, putting aside matters of sets and 1NF for t...Chris, putting aside matters of sets and 1NF for the moment, I very much agree with your consideration that it does not violate 1NF to have arrays or such ordered data as tuple elements. Indeed, it is much easier to find many examples where an ordered collection of elements is best treated as a single value and should be valid 1NF as a tuple element. The general case of polygon definitions for example (especially if normalized for direction). And in fact, it would be much easier to argue against any proposal to say that ordered collections can never be 1NF. Especially if framed with language like "only scalar values are allowed". After all, if one says that you can only have 1NF if everything is scalars, then I would ask, what is a character string? That is certainly not a scalar in the strictest sense, but rather an ordered list of elements. So it would be hypocritical to deny "non-scalars" but yet allow strings.Darren Duncanhttps://www.blogger.com/profile/14457831481137852460noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-8584031283995879602014-08-10T07:55:19.517-07:002014-08-10T07:55:19.517-07:00That's an interesting point as well. Interest...That's an interesting point as well. Interestingly although this does not meet the math definition of a 1NF violation, it is isomorphic with one and poses most of the same problems.Chris Travershttps://www.blogger.com/profile/06211762965865744803noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-66301204525670701812014-08-10T07:46:50.247-07:002014-08-10T07:46:50.247-07:00The triangle point is actually very interesting. ...The triangle point is actually very interesting. It isn't that you have a set, but rather that you have a value which is treated as a set for some purposes (i.e. two triangles are identical if and only if their sets of vertices are identical). Interestingly when you move to polygons with more than three sides, you can't reduce equality to a set operation, and therefore an SQL array of points is actually a better model since ordinality actually is important.Chris Travershttps://www.blogger.com/profile/06211762965865744803noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-14877167154255007762014-08-09T23:10:38.275-07:002014-08-09T23:10:38.275-07:00Replying to myself ...
Actually, besides allowing...Replying to myself ...<br /><br />Actually, besides allowing duplicate rows, there is 1 major practice that I consider a 1NF violation, and that is when you have multiple tuple elements with interchangeable function, for example, having separate fields named phone_1 and phone_2 or address_1 and address_2, and many similar, where there is no semantic meaning (eg, preference) between the 2+fields.<br /><br />So the above example I would consider the most common kind of 1NF violation, that and duplicate rows. Whereas, a single field containing a value that is a set, I would not consider a 1NF violation, even if it otherwise is not the best choice of design for that particular database. So having a field named "phone_numbers" to replace phone_1 .. phone_2 I would consider as coming into 1NF compliance.<br /><br />Darren Duncanhttps://www.blogger.com/profile/14457831481137852460noreply@blogger.comtag:blogger.com,1999:blog-3346090548501966296.post-31978105758345665932014-08-09T22:53:08.318-07:002014-08-09T22:53:08.318-07:00Chris, thanks for referencing my Set::Relation mod...Chris, thanks for referencing my Set::Relation module, I'm glad it was useful for your article.<br /><br />Speaking to your definition of First Normal Form, where no tuple contains a set, I think that having a set here is a good practice when the set as a whole represents a single value and has an identity.<br /><br />For example, a value representing a triangle can be represented by a set of 3 points, or a line segment by a set of 2 points. It would make sense to represent said value as a single set value (encapsulated or otherwise) in a single tuple, and not split it over 2 or 3 tuples. Remember, the points are NOT mutually ordered.<br /><br />Another point is that there are certain logical things that simply can't be represented at all in a natural way without having set values in tuples. For example, if you want to have a primary key over a set of tuples on elements that are themselves sets. I can point to the same example again, say you want to have a tuple per each triangle or line segment, then the set representing the triangle or line needs to have a key constraint on it, saying each set is distinct from every other set, and the other tuple attributes have a functional dependency on the set value that is the triangle or line segment's identity.<br /><br />Note that any natural solution to the use case I mentioned has no surrogate keys by definition.<br /><br />As a next point, the relational model works perfectly well from a logical standpoint when tuple elements are of any type at all, including other relations or sets or tuples etc, and I would argue there is no violation of first normal form occurring here, rather the only violation is where duplicate tuples are allowed.<br /><br />In fact, practice would require having set-valued elements in tuples. What form do you think you have when you do a GROUP BY on a relation? A relation whose tuples each have relation-typed elements, is what you have. This nested relations is the input for any aggregate operations you would subsequently apply, that then map each inner relation to simpler values.<br /><br />I grant you that in practice most of the time in stored relations your key tuple elements would be of simpler types like numbers or strings, but I would not say that having more complicated values as tuple elements is a violation of 1NF.<br />Darren Duncanhttps://www.blogger.com/profile/14457831481137852460noreply@blogger.com