Read this fifth in a series of six blog posts on metadata collaboration from David Loshin: Engaging the Business Users in Enterprise Semantics http://embt.co/1uOejZ7
Check out this blog post! Collaboration Principles and Universal Data Model – By Len Silverston, Universal Data Models, LLC http://embt.co/11Cu9dg
Check out this blog post by Karen Lopez and then let us know your Big Data Story. http://embt.co/1yPSjMe
I’ve been speaking, teaching and ranting on big data and NoSQL technologies recently. I’ve noticed that when I chat with data modelers, I’m often met with a lot of skepticism about big data.
You may be that “guy” if you’ve ever said:
• It’s just mainframe all over again.
• It’s a fad to get out of data modeling tasks.
• It’s not something I need to know about.
• I don’t have big data, so I don’t need to know NoSQL.
I think what’s missing from that thinking is the fact that modern data architectures use technologies that are the best fit for solving a data “problem”. I like to use the term “data story” instead of “problem.” It’s not that the newer technologies are replacing traditional relational database systems; they are complementing them.
Continue Reading here: http://embt.co/1yPSjMe
Understand how ER/Studio provides unlimited access to Enterprise models and metadata collaboration
ER/Studio Team Server Core provides greater meaning, understanding and context to enterprise data through team collaboration on an enterprise glossary of business definitions. This increases the value of enterprise data by giving employees across the organization the ability to use and improve metadata.
ERwin users – Do you struggle to remember all the options of complete compare? Do you wish your user-defined properties could be reused for more than one object type? Do you spend time creating reports instead of choosing from a large set of template reports? Would you rather administer groups of users than individual users? Would you prefer to have data lineage extend to ETL workflows feeding your data warehouse? Would you like to load your models from the repository more quickly?
When: Monday, December 8, 2014 1:00 PM – 1:45 PM GMT
Microsoft is a worldwide leader in software, services, and solutions. Founded in 1975, Microsoft is widely known for the Windows operating system and Office suite, but Microsoft’s business is also diversified across cloud computing, video gaming consoles (Xbox), phones, search (Bing), and other technologies.
With a vast range of data needs, Microsoft had implemented different data architecture solutions over time, but it became increasingly clear that a cohesive data management strategy was needed. The lack of clear enterprise data standards at Microsoft fostered extensive variation in how data was modeled across business groups. Furthermore, management had become concerned about end-to-end tracking of customer and partner data. Microsoft could see the necessity and urgency to develop an enterprise data model.
Microsoft performed a thorough review of data modeling tool capabilities prior to making a long-term decision. As a result of this review, Microsoft’s IT department chose ER/Studio because it could offer:
- Flexible partitioning of Microsoft’s extensive data model
- Extensive compare and merge capabilities
- Solid and responsive support interaction
- Standardization functionalities such as naming conventions and metadata
- The ability to consistently define entities for data models across the whole organization
- A flexible and comprehensive macro capability
For Microsoft’s Enterprise Data Architecture team, ER/Studio will be a vital part of a multi-year initiative to build an enterprise data model. ER/Studio gives the team the means to apply a rigorous approach to data modeling as well as the ability to support the requirements of a large data model. Data standardization improves data quality and saves costs, because other data architects won’t be starting from scratch when they create new models. Finally, a standard data approach avoids creating incompatibilities and feature gaps between systems.
“Going from a logical model, doing the data design upfront, creating the physical model, and forward-engineering into SQL Server is a great productivity aid.”
— Aaron Hanks, Principal IT Data Architect, Microsoft
Great Article by Cory Janssen
Definition – What does Schema on Read mean?
Schema on read refers to an innovative data analysis strategy in new data-handling tools like Hadoop and other more involved database technologies. In schema on read, data is applied to a plan or schema as it is pulled out of a stored location, rather than as it goes in.
Techopedia explains Schema on Read
Older database technologies had an enforcement strategy of schema on write—in other words, the data had to be applied to a plan or schema when it was going into the database. This was done partially to enforce consistency of data, and that is one of the major benefits of schema on write. With schema on read, the persons handling the data may need to do more work to identify each data piece, but there is a lot more versatility.
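To make the contrast concrete, here is a minimal Python sketch of the two strategies. It is only an illustration: the `SCHEMA` mapping and the function names are hypothetical, and plain in-memory lists stand in for a real database or a Hadoop file store.

```python
import json

# Hypothetical schema for illustration: field name -> expected type.
SCHEMA = {"id": int, "name": str}


def conforms(record, schema):
    """Return True if the record has exactly the schema's fields with matching types."""
    return (set(record) == set(schema)
            and all(isinstance(record[k], t) for k, t in schema.items()))


def write_with_schema(table, record):
    """Schema on write: validate before storing and reject anything that does not fit."""
    if not conforms(record, SCHEMA):
        raise ValueError(f"rejected on write: {record!r}")
    table.append(record)


def write_raw(store, line):
    """Schema on read: store the raw text untouched, whatever it contains."""
    store.append(line)


def read_with_schema(store, schema):
    """Apply the schema only when the data is pulled out of the store."""
    for line in store:
        raw = json.loads(line)
        row = {}
        for field, typ in schema.items():
            value = raw.get(field)
            try:
                # Coerce each field at read time; missing or bad values become None.
                row[field] = typ(value) if value is not None else None
            except (TypeError, ValueError):
                row[field] = None
        yield row
```

With this sketch, a record such as `{"id": "2", "name": "Bob"}` is rejected by `write_with_schema` because `id` arrives as text, while `write_raw` accepts the same data and `read_with_schema` does the extra work of coercing it when it is read.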
In a fundamental way, the schema-on-read design complements the major uses of Hadoop and related tools. Companies want to effectively aggregate a lot of data, and store it for particular uses. That said, they may value the collection of unclean or inconsistent data more than they value a strict data enforcement regimen. In other words, Hadoop can accommodate getting a wide scope of different little bits of data that might not be completely organized. Then, as that information is used, it gets organized. Applying the old database schema-on-write system would mean that the less organized data would probably be thrown out.
Another way to put this is that schema on write is better for getting very clean and consistent data sets, but those data sets may be more limited. Schema on read casts a wider net, and allows for more versatile organization of data. Experts also point out that it is easier to create two different views of the same data with schema on read.
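The point about two views of the same data can be sketched in a few lines of Python. The event fields (`customer`, `amount`, `page`) and both view functions are made up for illustration; the only assumption is that the raw events were stored as they arrived, one JSON document per line.

```python
import json

# Hypothetical raw events, stored exactly as they arrived.
RAW_EVENTS = [
    '{"customer": "acme", "amount": "19.99", "page": "/checkout"}',
    '{"customer": "globex", "page": "/home"}',
    '{"customer": "acme", "amount": "5.01", "page": "/cart"}',
]


def sales_view(lines):
    """Read the events as sales: total cents per customer, skipping events with no amount."""
    totals = {}
    for line in lines:
        event = json.loads(line)
        if "amount" in event:
            cents = round(float(event["amount"]) * 100)
            totals[event["customer"]] = totals.get(event["customer"], 0) + cents
    return totals


def traffic_view(lines):
    """Read the same events as web traffic: pages visited per customer, in order."""
    pages = {}
    for line in lines:
        event = json.loads(line)
        pages.setdefault(event["customer"], []).append(event["page"])
    return pages
```

Neither view required the stored data to change. A schema-on-write design would have had to choose one shape up front, and the event without an `amount` might never have been stored at all.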
This schema-on-read strategy is one essential part of why Hadoop and related technologies are so popular in today’s enterprise technology. Businesses are using large amounts of raw data to power all sorts of business processes by applying fuzzy logic and other sorting and filtering systems involving corporate data warehouses and other large data assets.