Data management: approaches to data management. Data management systems


Data management is the basis of database administration.

Basic concept of data management.

Organization of data management.

Database administration.

Conclusion.

Data management - the basis of database administration

Data management encompasses all data processing, from data collection to archiving and delivery to users. It covers both the technological and the organizational aspects of collecting and processing data. Database administration is the component of data management associated with the DBMS.

Data management can be considered at the data source, data center, or project (program) level. Each level may include the previous levels of data management. For example, data management at the center level necessarily involves collecting data from data sources. A large scientific program may involve multiple experiments, each of which may have its own data management plan.

A data management plan is an organizational document that defines all stages of data processing, as well as the means for their implementation.

The goals of creating a data management plan are to improve the collection of, access to, and use of information; to develop databases; and to standardize data collection and exchange procedures.

Basic Data Management Concept

A data management plan must take into account long-term decisions about:

    development and standardization of common technologies for collecting and exchanging data, which reduce the time lag between data collection and data access;

    increasing cooperation in collecting, archiving, processing and mapping data;

    creating distributed databases;

    combining new and historical data to obtain relevant time series;

    database compatibility through the use of common formatting and quality-control protocols for individual disciplines;

    access to archived data.

The data management methodology should be based on the use of the most effective means:

    creating multi-level data catalogs;

    using catalogs to search for and evaluate duplicates (see the sketch after this list);

    searching for and exchanging data;

    converting data into common formats;

    controlling data quality at the various stages of processing;

    creating new data processing methods;

    providing access to data on CD, via the Internet, etc.
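As a minimal, hypothetical sketch of the duplicate-search idea, the following Python fragment builds a one-level catalog that maps content checksums to files, so that files sharing a checksum become duplicate candidates; the `data/` directory and the `catalog_files` helper are invented for the example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def catalog_files(root: str) -> dict:
    """Catalog every file under `root`, keyed by a checksum of its content."""
    catalog = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            catalog[digest].append(path)
    return catalog

if __name__ == "__main__":
    catalog = catalog_files("data/")  # hypothetical data directory
    for digest, paths in catalog.items():
        if len(paths) > 1:  # same checksum -> duplicate candidates
            print(f"{digest[:12]}...: {len(paths)} copies -> {paths}")
```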

The data management plan promotes a better understanding by all project participants of the shared scientific interests, public needs, and legal issues involved. Data management begins with the design of a measurement program for an expedition or project and the creation of a database, and ends with user access to quality-controlled, well-documented databases. A data management plan should be a key element of all major projects and programs. It helps maximize the return on the investment made in a project by providing for full use of the resulting data; in other words, the data management plan is the mechanism for disseminating and using the results of the project, a specific activity carried out within the framework of national, international, or corporate policies based on best data management practices.

The plan should describe the work, the technology requirements, and the associated results in the design of measurement activities, data collection and reporting, documentation, quality control, database creation, and data access.

One of the main tasks of any project, and especially of a data center, is the creation of metadata databases. Common approaches to data management benefit both the specialists working on these projects and society as a whole (through faster use of data); they make more efficient use of most data sources, and they ensure that data intended for general use are well documented and quality-controlled by the time the project is completed.

Adequate data management depends on the capabilities of national organizations, political factors, technical problems, project financing conditions, good coordination among all project participants, and the availability of appropriately qualified staff.

Logical level (formalized/model description)

The logical level of information technology is represented as a complex of interconnected models that formalize the information processes involved in transforming information into data. Representing information technology formally, as models, makes it possible to link the parameters of information processes and to manage those processes and procedures. Fig. 2.12 shows the logical model of basic information technology, reflecting how the information process models are interconnected.

Based on the subject-area model, which characterizes the control object, a general management model is created, which in turn forms the models of the tasks to be solved. Since various information processes are used to solve management tasks, a model of their organization must be built, linking at the logical level the management processes used in solving the tasks.

Fig. 2.12. The logical model of basic information technology.

Data processing involves all of the basic information processes: the processing, exchange and accumulation of data, and the representation of knowledge.

The data processing model includes a formalized description of the procedures for organizing the computational process (the operating system), for data transformation (algorithms and programs for sorting, searching, and creating and transforming static and dynamic structures), and for logical inference (modeling).

The data exchange model contains a formal description of the procedures performed in a computer network: transmission (coding, modulation in communication channels) and switching and routing (network exchange protocols). It is described using international standards: OSI (Open Systems Interconnection), the IEEE 802 local-network standards, and the Internet specifications (see Chapter 18).

The data accumulation model describes both the database management system (DBMS) and the information base itself, which can be defined as a database and a knowledge base. The transition from the semantic (information) representation to the physical one is carried out by a three-level system of information base models: conceptual (what information should be accumulated, and to what extent, during the operation of the information technology), logical (the structure of and relationships among information elements) and physical (the methods of placing data on computer media and accessing it). Database management functions are regulated by (see Chapter 19): the SQL (Structured Query Language) database language; the IRDS (Information Resource Dictionary System) information and reference system; the RDA (Remote Data Access) protocol for remote access operations; and Microsoft's PAS (Publicly Available Specifications) for the open ODBC (Open Data Base Connectivity) API (Application Program Interface) for database access.
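To make the call-level side of this concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for an ODBC-style database API; the table, columns, and values are invented for the example. The pattern (connect, execute parameterized SQL, fetch) is the same one ODBC exposes to C applications.

```python
import sqlite3

# Open a database (SQLite here; an ODBC-style driver would expose the same pattern).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observation (station TEXT, value REAL)")
conn.executemany(
    "INSERT INTO observation (station, value) VALUES (?, ?)",
    [("A-01", 12.5), ("A-02", 13.1)],  # invented sample data
)
conn.commit()

# A parameterized SQL query, the core of the Structured Query Language interface.
for station, value in conn.execute(
    "SELECT station, value FROM observation WHERE value > ?", (12.8,)
):
    print(station, value)
conn.close()
```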

The knowledge representation model is selected depending on how fully the subject area must be reproduced and on its content, as well as on the type of problems being solved. Currently, logical, algorithmic, semantic, frame, and integral knowledge representation models are used.

The information acquisition model is built taking into account the standards governing data and document structures, as well as data formats:

  • the ASN.1 (Abstract Syntax Notation One) language, intended for specifying applied data structures, i.e. the abstract syntax of application objects;
  • the CGM (Computer Graphics Metafile) metafile format for presenting and transmitting graphic information;
  • the EDIFACT (Electronic Data Interchange for Administration, Commerce and Transport) specifications for messages and electronic data interchange in government, commerce and transport;
  • the ODA (Open Document Architecture) specifications of documents and their structures;
  • specifications of document structures for production, for example SGML (Standard Generalized Markup Language);
  • languages for describing hypermedia and multimedia documents, for example HyTime, SMDL (Standard Music Description Language), SMSL (Standard Multimedia/Hypermedia Scripting Language), SPDL (Standard Page Description Language), DSSSL (Document Style Semantics and Specification Language) and HTML (HyperText Markup Language);
  • specifications for graphic data formats, e.g. the JPEG, JBIG and MPEG formats.

The information display model is built taking into account the X Window System, MOTIF and OPEN LOOK standards, the VT terminal standards, the CGI, PHIGS and GKS computer graphics standards, and the conventions of the graphical user interface (GUI).

The information, data and knowledge management models link the basic information processes and synchronize them at the logical level.

Since the basic information processes operate on information, data and knowledge, information is managed through the processes of acquisition (collection, preparation and input) and display (rendering of graphics, text and video, and speech synthesis); data are managed through the processes of processing (organizing the computational transformation process), exchange (routing and switching in a computer network, transmitting messages over communication channels) and accumulation (database management systems); and knowledge is managed through knowledge representation (managing the acquisition and generation of knowledge).

Physical level (software and hardware implementation)

The physical level of information technology is its software and hardware implementation. At the physical level, information technology is considered as a system consisting of large subsystems: data processing, data exchange, data accumulation, acquisition and display of information, knowledge representation, and management of data and knowledge (Fig. 2.13). The user and the system developer interact with the system that implements information technology at the physical level.

Fig. 2.13. Information technology at the physical level as a complex of subsystems.

Data processing subsystems are built on computers of different classes, which differ in computing power and performance. Depending on the needs of the tasks being solved, both large general-purpose computers (mainframes), for processing enormous amounts of information, and personal computers (PCs) are used. In a network, both servers and clients (workstations) are used.

Data exchange subsystems include complexes of programs and devices (modems, amplifiers, switches, cables, etc.) that form a computer network and perform switching, routing and network access.

The data accumulation subsystem is implemented using data banks and databases on the external storage devices of the computers that manage them. Both local databases and data banks, implemented on individual computers, and distributed data banks, built on computer networks with distributed data processing, are possible.

The information acquisition, information display and knowledge representation subsystems are used to form a model of the subject area from its fragments and a model of the problem being solved. At the design stage, the developer forms a set of models of the problems to be solved in computer memory. At the operation stage, the user accesses the information display and knowledge representation subsystem and selects the solution model appropriate to the task at hand, after which the other subsystems are activated through the data management subsystem.

The data and knowledge management subsystem is, as a rule, implemented partly on the same computers as the corresponding subsystems and partly by means of computer process management systems and database management systems. Where information flows are large, dedicated network and database administration services are created.

Definition

Data management is a comprehensive series of procedures followed to develop and maintain quality data using technology and available resources. It can also be defined as the execution of architectures according to predefined rules and procedures to manage the complete life cycle of a company's or organization's data. It takes in all disciplines related to managing data as a resource.

The following are the key steps, procedures or disciplines of data management:

1. Database management system

2. Database administration

3. Data storage

4. Data modeling

5. Ensuring data quality

6. Data security

7. Data movement

8. Data architecture

9. Data analysis

1. Database management system:

This is one of many computer programs of various types and brands available today, designed specifically for data management. To name just a few: MS Access, MS SQL Server, Oracle, MySQL, etc. The choice among them depends on company policy, expertise and administration.

2. Database administration:

Database administration is the group responsible for all aspects of data management. The roles and responsibilities of this team depend on the company's overall database management policies. It implements systems, using software procedures and protocols, to support the following functions:

a. Database development and testing

b. Database security

c. Database backups (see the sketch after this list)

d. Database and software integrity

e. Database performance

f. Ensuring maximum database availability
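As a minimal sketch of item (c), assuming a SQLite database, the following Python fragment uses the standard library's online-backup API to snapshot a live database; the file names are hypothetical.

```python
import sqlite3

def backup_database(src_path: str, dst_path: str) -> None:
    """Copy a live SQLite database to a backup file using the online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    with dst:
        src.backup(dst)  # consistent snapshot, even while src is in use
    src.close()
    dst.close()

backup_database("production.db", "production-backup.db")  # hypothetical file names
```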

3. Data storage

A data warehouse, in other words, is a system for storing and organizing historical data. This system contains the raw material for query- and decision-support systems: from it, analysts can obtain historical data in any form, such as trends, aggregated data, and answers to complex questions and analyses. Such reports are important for any company to review its investments or business trends, which in turn feed future planning.

The data warehouse is based on the following principles (a minimal sketch of (b) and (c) follows the list):

a. Databases are organized in such a way that all data elements associated with the same events are connected with each other

b. All changes to the databases are recorded for future reporting

c. Data in the databases is never deleted or overwritten; the data is static and read-only

d. The data is consistent and covers the organization's information as a whole.
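Here is a minimal Python sketch of principles (b) and (c): rows are only ever appended, each carrying a timestamp, so history is preserved and nothing is rewritten. The table and figures are invented.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE sales_fact (
           recorded_at TEXT NOT NULL,   -- every change is recorded (principle b)
           product     TEXT NOT NULL,
           amount      REAL NOT NULL
       )"""
)

def record_sale(product: str, amount: float) -> None:
    """Append-only load: rows are inserted, never updated or deleted (principle c)."""
    conn.execute(
        "INSERT INTO sales_fact VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), product, amount),
    )

record_sale("widget", 10.0)
record_sale("widget", 12.0)

# Reporting reads the static history; no row is ever rewritten.
for row in conn.execute("SELECT * FROM sales_fact ORDER BY recorded_at"):
    print(row)
```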

4. Data modeling

Data modeling is the process of creating a data model by applying data-model theory to produce a data model instance. Data modeling is, in practice, defining, structuring and organizing data using a predefined protocol. These structures are then implemented in a data management system; in addition, they impose certain constraints on the database structure.
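As a small, hypothetical illustration of defining and structuring data under a predefined protocol, this Python sketch declares two related entities and enforces one structural constraint; the entity names are invented.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer: Customer                      # structural relationship between entities
    items: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # A constraint the model imposes on the database structure.
        if self.order_id <= 0:
            raise ValueError("order_id must be positive")

order = Order(order_id=1, customer=Customer(customer_id=7, name="Acme"), items=["bolt"])
print(order)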

5. Ensuring data quality

Data quality assurance comprises the procedures implemented in data management systems to remove anomalies and inconsistencies from databases. It also includes database cleansing to improve the quality of the data.
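A minimal sketch of such cleansing, on invented sensor records: duplicates and physically implausible values are removed before the data enters the database. The field names and limits are assumptions.

```python
def clean_records(records: list) -> list:
    """Remove duplicate and anomalous records (hypothetical temperature data)."""
    seen = set()
    cleaned = []
    for rec in records:
        key = (rec["station"], rec["timestamp"])
        if key in seen:                         # inconsistency: duplicate observation
            continue
        if not -60.0 <= rec["temp_c"] <= 60.0:  # anomaly: implausible value
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned

raw = [
    {"station": "A", "timestamp": "2024-01-01T00:00", "temp_c": 3.2},
    {"station": "A", "timestamp": "2024-01-01T00:00", "temp_c": 3.2},    # duplicate
    {"station": "B", "timestamp": "2024-01-01T00:00", "temp_c": 999.0},  # anomaly
]
print(clean_records(raw))  # only the first record survives
```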

6. Data security

Also called data protection, this is the system or protocol implemented to ensure that databases are stored completely securely and that no one can cause harm, enforced through access control. Data protection also ensures the privacy and protection of personal data. Many companies and governments around the world have enacted privacy laws.
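As a toy sketch of the access-control idea, the following Python fragment gates every read of a record containing personal data behind a role check; the roles, permissions and record are invented.

```python
PERMISSIONS = {
    "analyst": {"read"},
    "admin": {"read", "write"},
}

def read_record(role: str, record: dict) -> dict:
    """Return the record only if the role carries read permission."""
    if "read" not in PERMISSIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not read data")
    return record

record = {"name": "Jane Doe", "salary": 50_000}  # personal data to be protected
print(read_record("analyst", record))            # allowed

try:
    read_record("guest", record)                 # denied by access control
except PermissionError as exc:
    print(exc)
```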

7. Data movement

One concept widely associated with the data warehouse is ETL (Extract, Transform and Load). ETL is the process by which data is moved into the warehouse, and it is critical, since it determines how the data is loaded.
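Here is a minimal end-to-end ETL sketch in Python: extract rows from a source CSV file, transform them (normalize and filter), and load them into a warehouse table. The file name and schema are assumptions made for the example.

```python
import csv
import sqlite3

def extract(path: str):
    """Extract: read raw rows from a source CSV file."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    """Transform: normalize names and drop rows without an amount."""
    for row in rows:
        if row["amount"]:
            yield row["customer"].strip().title(), float(row["amount"])

def load(rows, conn: sqlite3.Connection) -> None:
    """Load: append the cleaned rows into the warehouse table."""
    conn.executemany("INSERT INTO sales (customer, amount) VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect("warehouse.db")  # hypothetical warehouse database
conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
load(transform(extract("sales.csv")), conn)  # hypothetical source file
```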

8. Data architecture

This is the most important part of a data management system: the procedure for planning and defining target data states. It works by understanding the target state, that is, by describing how data is processed, stored and used in a given system, and by setting the conditions for processing operations that allow data flows to be created and controlled throughout the system.

Essentially, the data architecture defines the target states and alignments during initial development and is then maintained through minor refinements. In working toward those states, the data architecture is broken down into smaller sub-layers and parts, which are then assembled into the desired shape. These layers can be created with the three traditional data architecture processes:

a. Conceptual, which represents all business entities

b. Logical, which defines how those business entities are related

c. Physical, which is the implementation of the data mechanisms for a specific database function

From the above, we can see that data architecture includes a complete analysis of the relationships among functions, data types and technology.
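To trace one hypothetical entity through the three layers, the sketch below states the conceptual model as comments, expresses the logical relationships as tables and keys, and makes the physical choices (engine, index) explicit; the schema is invented.

```python
import sqlite3

# Conceptual layer: the business fact "a Customer places Orders".
# Logical layer: how the entities are related, expressed as tables and keys.
# Physical layer: concrete storage decisions (engine, indexes).

conn = sqlite3.connect(":memory:")  # physical: SQLite chosen as the storage engine
conn.executescript(
    """
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "order" (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id)  -- logical relationship
    );
    CREATE INDEX idx_order_customer ON "order"(customer_id);  -- physical access path
    """
)
```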

9. Data analysis

Data analysis is a series of procedures used to extract the required information and report findings. Depending on the type of data and the query, this may involve applying statistical methods, identifying trends, or selecting or rejecting subsets of the data based on specific criteria. In fact, data analysis is the verification of an existing data model, or the extraction of the parameters needed to fit a theoretical model to reality.

Data mining is the procedure for obtaining previously unknown but useful data parameters. It can also be defined as a series of procedures for extracting useful, previously unknown information from large databases. Data mining is, in principle, sorting through large amounts of data and selecting the information relevant and necessary for a particular purpose.
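A small sketch of the procedures just described, on an invented data set: select a subset by a criterion, reject the rest, and extract simple statistical parameters.

```python
import statistics

sales = [120, 135, 150, 90, 160, 175, 60, 180]  # hypothetical monthly figures

# Selection: keep only months meeting a criterion (rejecting the rest).
strong_months = [s for s in sales if s >= 100]

# Statistical parameters used to check a model against reality.
print("mean:", statistics.mean(strong_months))
print("stdev:", round(statistics.stdev(strong_months), 2))
print("trend (last - first):", strong_months[-1] - strong_months[0])
```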

Data management is a process that involves collecting, storing, processing and interpreting accumulated data. Today, for many companies, data management is an excellent opportunity to understand the data they have already collected, to “recognize” competitors, to build predictive analytics (forecasting), and to answer many business questions.

Data management

What does data management include? Let's list the main processes:

  • Database Management
  • ETL processes (data extraction, transformation and loading)
  • Data collection
  • Data protection and encryption
  • Data Modeling
  • Data analysis itself

Based on the above, it becomes clear that successful data management requires you to:

  • Resolve the technical issues (select a database; decide where the data will be stored: in the cloud, on a server, etc.)
  • Find competent human resources :)

Key challenges in data management

Among the most common errors and difficulties that arise when collecting, storing and interpreting data are:

  • Incomplete data
  • “Duplicated” data (copies that often contradict each other)
  • Outdated data

A product that helps connect data from different sources, enrich it and prepare it for use in Business Intelligence systems can resolve many of these issues at the data loading stage.

Data analysis

Do you already have a sufficient amount of necessary and important data? Then, in addition to storing it, you need to analyze it. Data analysis will help you answer many business questions, make informed decisions, “see” your customer, and optimize warehouse and logistics processes. In short, data analysis is important and needed in any field, in any company, at any level.

The data analysis solution consists of three main blocks:

  • Data store;
  • ETL procedures (data extraction, transformation and loading);
  • Reporting and visual analytics system.

All this seems quite complicated, but in reality it’s not all that scary.

Modern analytical solutions

What should companies do if they have no staff analysts and no developers, but do want to do analytics?

Of course, there is a solution. There are now plenty of automated systems on the market for analytics and, importantly, for visualization of your data.

What are the advantages of such systems (such as Tableau):

  • Quick to implement (just download the program and install it, even on a laptop)
  • No complex IT or mathematical knowledge required
  • Low cost (from RUB 2,000 per month for a license as of March 2018)

So, any company can implement this analytical product: it doesn’t matter how many employees it has. Tableau is suitable for both individual entrepreneurs and large companies. In April 2018, the UN selected Tableau as the analytics platform for all its offices around the world!

Companies that work with such automated analytics systems note that tabular reports that previously took 6 hours to build are assembled in Tableau in literally 10-15 minutes.

Don't believe me? Try it yourself: download a free trial version of Tableau and receive training materials on working with the program:

Download Tableau

Download the full version of Tableau Desktop FREE for 14 days and receive Tableau business analytics training materials as a GIFT






