In this episode, Angela Horton, Director of Data Quality and Accuracy at a large health care organization, walks through the data life cycle and shares information about the importance of data quality and data governance. Listen to this episode to learn how Angela got where she is and how data is used at the health care organization she works for.
- Hello, and welcome back to another episode of
Buffalo State Data Talk. I'm your host, Heather Campbell, and thank you for joining us. Today, we're talking to Angela Horton,
the Director of Enterprise
Data Quality and Accuracy at a large healthcare organization. Angela has worked in the
field for over 10 years and was recently awarded the
Women in Technology BETA Award from InfoTech Western New York.
Congratulations on that.
- Thank you. - And thank you so much
for joining us today. - I'm so excited to be here. Thank you for having me. - Let's get started.
Could you give us a general overview of the work that you do? - As the Director of Data
Quality and Accuracy, my team really focuses on
assessing the data quality in our source systems,
and that's everything
from creating rules that
identify data quality issues to monitoring those data quality issues. But furthermore, sending our
data owners and our stewards those issues for them to go
improve in their source systems, which really increases the quality of data
in our data sources. - It sounds like you do everything having to do with data a little bit. (chuckling) - Yeah, so a lot of
understanding and profiling
of that data, finding anomalies
is a big part and portion of what my teams do and really
they bring those anomalies that the data owners
may not have even known existed in the source systems. And we also gather a lot
of those requirements
from data scientists because they're in that data all day long and they're finding a
lot of these anomalies, so it's a nice partnership
between data science, the operational folks,
and my data quality team.
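As a rough illustration of the rule-based checks Angela describes (rules that flag data quality issues so they can be sent back to data owners), here is a minimal Python sketch. The records, field names, and rules are all hypothetical, made up for this example, and are not taken from her organization's systems or tools.

```python
from collections import Counter

# Hypothetical member records; field names and values are invented for illustration.
records = [
    {"member_id": "A1", "birth_year": 1980, "zip": "14222"},
    {"member_id": "A2", "birth_year": None, "zip": "14222"},
    {"member_id": "A3", "birth_year": 2190, "zip": "1422"},
    {"member_id": "A1", "birth_year": 1980, "zip": "14222"},
]

# Each rule is a name plus a predicate that returns True when a record passes.
# The thresholds below are assumptions, not real business rules.
RULES = {
    "birth_year_present": lambda r: r["birth_year"] is not None,
    "birth_year_plausible": lambda r: r["birth_year"] is None
    or 1900 <= r["birth_year"] <= 2025,
    "zip_is_5_digits": lambda r: isinstance(r["zip"], str)
    and len(r["zip"]) == 5
    and r["zip"].isdigit(),
}


def find_issues(rows):
    """Return (member_id, rule_name) pairs for failed rules, plus duplicate IDs."""
    issues = [
        (r["member_id"], name)
        for r in rows
        for name, check in RULES.items()
        if not check(r)
    ]
    # Profile the ID column: any ID seen more than once is a possible duplicate.
    id_counts = Counter(r["member_id"] for r in rows)
    issues += [(mid, "duplicate_member_id") for mid, n in id_counts.items() if n > 1]
    return issues


issues = find_issues(records)
for member_id, rule in issues:
    print(member_id, rule)
```

In a setup like this, the resulting list of issues is what would be routed to the data owners and stewards to correct in the source system, rather than being patched downstream.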
- Now that we have a bit of a
general idea of what you do, could you tell me what a typical day or a typical week looks like? - Yeah, so being the Director, I sit at more of a higher
level view and vantage point.
A day in the life of myself
would be lots of meetings, lots of coordinating, lots of directing of different people
and folks and processes and getting them all on the same page and getting buy-in a lot of
times from different areas.
I consider my job more like a
traditional traffic director, I am directing people
from going this direction, that direction, getting
all the right people moving into the same fashion. And so, I probably spend
80% of my time in meetings
of all sorts and different kinds. - Excellent. - But I would say when it
gets to more of my analysts and what their day-to-day looks like, which is probably more
interesting for the folks
that are listening in on the call, they're integrated very much into the data and they're mining millions
and billions of pieces of data at a time, working with different tools, and really finding those anomalies,
and then working with the
different constituents throughout the organization to understand if these are anomalies,
what can we do to fix those? Is it a people, process,
or a technology issue that we kind of need to solve for?
They get the more fun job, in my opinion. - Of all of that huge amount of data that your team is working through, where does the data come from? What kind of data is it?
- It's all over the board, so
it's patient-centered data, it's our practitioners and our providers, we have claims data, we have what you'll hear maybe
in the industry as HIE, which is health information
exchange data.
It comes from EMRs, it comes
from claims processing, it comes from enrollment, it
comes from all over the place. We have vendors that submit
data to us oftentimes. There's a wide range
of various data sources and people that own those
various data sources.
- Once your team has
either collected the data or been sent the data and they
clean it, what happens then? How is the data actually used? - That's a great question and something that I care
very passionately about.
I started my career in
analytics 10 years ago, and I've done everything
from business intelligence to clinical analytics to client analytics, and that's really the heart of where this data is being used for.
After my team identifies the issues and we work with our
partners to cleanse the data, that really is the downstream impacts of what our data scientists,
our client reporting teams, our clinical analytics teams are using
to do their predictive models,
whether it be predicting who might have heart problems in the future or providing lists to our
care managers to do outreach because maybe this person
isn't quite high risk now, but they could be in the future
based on their predictive model. And so, I always say
the cleaner the data is, the more accurate their
predictive models are, and the more statistically
significant they will be, and the better outreach that we can have
for our patients and customers. - And what kind of software technology or programming languages
do your analysts use to clean and analyze the data? - There's a bunch out there.
Gartner Magic Quadrant, for those who don't know Gartner yet, they're a great organization for understanding the
leading industry competitors and companies and vendors, but
one of them is Informatica.
And so, we use Informatica
to do that identification and that cleansing and that monitoring. Our data scientists though, however, use a lot of R and Python. They use SPSS to a certain extent.
That's an IBM product, which really does some
good statistical analysis. We use Tableau, Power BI. I mean, there's a lot of
really great tools out there to do dashboarding.
I would say there's an ETL aspect, which is extract, transform, and load, which is basically how you get the data into those databases. I know my data quality team picks that up,
or sometimes we do it in motion. That means evaluating the data before it even gets into the database, making sure it's cleansed and clean. And then, that's where the data scientists
and the analysts pick it up
using things like R and Python and Power BI and Tableau to
do those different models and dashboards and all that fun stuff. - You've mentioned a
lot of different groups that help move the data
from the collection stage
all the way to analysis. Could you talk briefly about what those different groups are, their names or typical titles, and what they actually do in the process?
- Yeah, so it always
starts, in my opinion, with a combination of a data
analyst and a developer, and the data analyst's role really is the person who is going to profile. Say you have a set of data from a vendor
that you would like to
bring into a database. The data analysts will go
through and profile that data and understand typically what the nature of that information is. Are there duplicates in the data?
Do I need to transform it
or map it into a certain way that it should look? That person really kind of puts together a requirements
document, in a way, saying I have these 25 data elements
and we need to conform them and map them maybe into the standard
that we already have in our warehouse. That requirements document then gets provided to a developer,
which that's where the
ETL comes into play. Again, that's the extract, transform, and load process, and they'll code in
something like PowerCenter or, I mean, Alteryx, or SSIS. I mean, there's several
different ETL tools
out there to use. They'll take those requirements documents and start to code and
bring that data through. Generally for a lot of organizations, there's also a quality assurance area
that will take the code from the developer and make sure that it's in line with the requirements document. Their role is to make sure that everything that the developer is coding is up to spec
with what the analysts have documented in terms of requirements. Once all of that is done, then we do something
called the SDLC process. It's basically promoting code
through a development area
and then a quality assurance area, and then finally a productionalized area. Once quality assurance signs off, that data goes into production. And then that's where the
folks like the data scientists
and the analysts, people
doing the dashboards, operational folks can really go and start to query and analyze the data. I'd say that's a good kind of workflow of how to get the data through the process
into actual usable data. - Are you able to set time aside for your own professional development? And if so, what kind of
activities do you do? - Yes.
And again, leading by example, that's what we should do as leaders. It's important for me to
carve out that time for me so my teams know that it's okay for them to carve out that time for them.
It's like, we're at
the grind all day long, but you know what, every
Friday, carve out two hours for some development time for you, and I'm doing it for myself, so I want them to do it
for themselves, as well.
Udemy has a lot of really great courses and a lot of times,
many of them are on sale for like 80% off. There's LinkedIn Learning
as an option too. There's a lot of really great
resources on YouTube, as well.
If you're into Power BI,
there's Guy in a Cube, I think his name is, and
he walks through a bunch of different learning opportunities and there's a lot of really
nice resources out there. I personally take advantage of those still
and they're very helpful for me. - Excellent. Maybe you can send me a link to a couple of your favorite ones and we can post them in the description of the episode.
- Yeah, I definitely will. - You previously worked as
a data governance manager and data governance is
a really important part of data analytics, but something that's not always
necessarily talked about.
Could you tell us what
exactly is data governance and how does it play a role in your work? - Yeah, so data governance is, in my mind, a component of information management. I would say, so taking
kind of a little step back,
information management is
everything from data governance to data quality, reference
table management, reference data management,
standards, security, data use policies, master data management. All of those things kind of roll up
into what information management,
the discipline, really is. And so, a component of
that is data governance. DAMA, which is Data
Management Association, which is a well-renowned
association in the data world, they define it as planning,
oversight, and control
over management of data and the use of data and data-related resources. Included in that are security,
the data use policies, stewardship committees,
custodian responsibilities. When I spoke about working
with our data owners
and our stewards on fixing that data, that's where data quality
and data governance really have a nice partnership because as we're
identifying the data issues, then we work with the data
governance team to say,
okay, you have your data
stewards and owners, well, let's work together
to try to get those teams to resolve and fix those issues. Master data kind of also
plays in line with that too. Mastering your data means
that you kind of have this golden record. Heather is Heather. No matter which company she
has gotten insurance with, she's always just gonna
be Heather in the system with one master ID,
because if you're moving
in and out of somewhere, oftentimes, you'll get
several different IDs and it's really hard to do
a longitudinal analysis if I've gotta know that you've got 25 IDs. Mastering your data really
brings the best data for you
up to light and allows the
analysts, the data scientists to really understand that that is Heather and all of her information is going to be under that one master ID. But there are oftentimes
when you're mastering data
that the technologies,
while they're really great at being able to auto collapse
and merge and split things, sometimes it's just not always that easy, so you need a data owner or a steward to do some clerical review
on maybe some suspect records
and have them manually go in
and collapse those records for you or for me or whomever. And so, that's where the partnership between data governance and
master data comes into play because the technology
solution may not always be able
to do that automatically, but the stewards and that real human eye
and that special touch can really go and make
those determinations. And so, they partner
with data governance too. Data governance is really at
the heart of managing data.
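The "golden record" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source records, the exact-match rule (normalized name plus date of birth), the looser "suspect" rule (same last name and date of birth), and the `M0001`-style master IDs are all hypothetical. Real master data tools use far more sophisticated matching.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical source records: one person known under different system IDs.
source_records = [
    {"source_id": "PLAN1-001", "name": "Heather Example", "dob": "1990-01-15"},
    {"source_id": "PLAN2-884", "name": "heather  example", "dob": "1990-01-15"},
    {"source_id": "PLAN3-312", "name": "H. Example", "dob": "1990-01-15"},
]


def normalize(name):
    """Lowercase the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def build_master_index(records):
    """Auto-collapse records that match exactly on normalized name and DOB."""
    groups = defaultdict(list)
    for rec in records:
        groups[(normalize(rec["name"]), rec["dob"])].append(rec)
    # Assign one made-up master ID per group, in first-seen order.
    return {f"M{i:04d}": recs for i, recs in enumerate(groups.values(), start=1)}


def suspect_pairs(master_index):
    """Pairs of master IDs that share last name and DOB: candidates for steward review."""

    def loose_key(recs):
        first = recs[0]
        return (normalize(first["name"]).split()[-1], first["dob"])

    return [
        (a, b)
        for a, b in combinations(master_index, 2)
        if loose_key(master_index[a]) == loose_key(master_index[b])
    ]


master_index = build_master_index(source_records)
review_queue = suspect_pairs(master_index)

for master_id, recs in master_index.items():
    print(master_id, [r["source_id"] for r in recs])
print("needs clerical review:", review_queue)
```

Here the first two records collapse automatically into one master ID, while the "H. Example" record is not merged on its own. It lands in a review queue, which is the point where a data steward would decide by hand whether to collapse it.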
- It's something that's
very important and needed for you to make the most and
best analysis of your data. - Yes, I think so. There are two really great
pieces of literature out there. One is Get Governed by Morgan Templar
and another is Non-Invasive
Data Governance by Robert Seiner. And both of those books
I've read front to back and they are fantastic
data governance books if anybody wants to get up to speed
with the importance of data governance, kind of some playbooks, and the best approaches
to data governance itself. - Excellent. And we'll link those two books
in the description of the episode too. I know I mentioned you worked
previously in data governance, and I just kind of wanna talk a little bit about your career path now. You received a Bachelor's Degree
from Buffalo State in Mathematics. Let's just start there. How would you say that your
background in mathematics affected your career path? - I personally think mathematics
teaches you problem solving,
how to think outside the box with different problems, 'cause with math, there's often not a black
and white way to get there. And so, in mathematics, if you're able to solve a
problem several different ways,
then when you enter into the workforce, there's very rarely a time
where a problem you face in your professional career that will have a black and
white way to get there. I feel like I've learned from
all of the courses I've taken
at Buff State, whether
it be numerical analysis or calc three, or, you're
bringing me back now, but they really taught me
how to think outside the box. Some advice, if I could
give some for our listeners, would be don't be afraid
to be a fresh eye in an organization, don't be afraid to ask questions and kind of poke holes
in some of the processes because what you're gonna find is that your outside the box thinking
is going to be refreshing
to the organization and
they'll start to find new ways and they'll implement your ideas into doing things a
little bit differently. And I truly believe that
mathematics helped me do that. - What made you first interested
in moving into your field?
- Well, I started as an intern
at a healthcare brokerage. I don't know if our listeners know much about the difference
between healthcare industry, a payer versus a broker,
but there are brokers that work with the employer
groups to sell them insurance
and they try to pick the
best insurance company that makes sense for that employer group. I started an internship with a brokerage and I was doing data entry at the time. They were a small, very small broker,
but I learned so much
about how to use Excel and what data entry was. And then I started moving into more putting that data entry into databases and starting to analyze it.
And that's when I really realized I wanted to do data analytics because what I really enjoyed at the time was making PivotTables
and uncovering some trends that my boss at the time
didn't really know were even out there. Uncovering those patterns
and those anomalies that no one would know of, unless somebody would dig into them, that's when I realized
that this was for me.
And then from there, I started at Blue Cross Blue
Shield of Western New York as an employer group reporting analyst. And then I kind of moved
into business intelligence, and then I got into clinical analytics,
which kind of gets into data science. And then through my path, I
realized how dirty the data was and how difficult and how
much time people spend on cleaning data as analysts. And so, I said, I feel
like I need to be part
of a data management group
or information management or data quality or whatever it is because I need to be part of the solution to cleaning and cleansing the data so that my colleagues
down the road don't have
to spend so much time doing that work and they can actually
spend that time analyzing and doing the things that they love to do. - From the way you're
talking, it was really fun 'cause I could tell you
were passionate about it,
when you first saw those PivotTables, and it is really cool to take some data and then you find, I got these results and they could make
real actionable changes. - Yes.
- It's a really awesome feeling to have, and to be able to help in that process, cleaning the data so
that you can get there. It must be very rewarding. - It is very rewarding.
- That's awesome. It's great that you're
passionate about what you do. - Yes, very. - Many of our listeners are younger people who are students that are still in school.
As somebody who is in a leadership role, who has had people working underneath her, and has mentored people before, what advice would you give somebody who's interested in working
as a data analyst or a data scientist? - Yes. I think there are a lot of great resources by reaching out to folks
on LinkedIn, for example, I've done that in my past.
I have searched maybe my dream
job or a position or a role. You can type in the search
engine, data scientist, and up will come a bunch of
people that do that work. And I would encourage
listeners to reach out to them, message them, and say, hey,
I'm just curious to learn
about what you do at your organization. Can you share some information? I've done that myself and
I found it very valuable and you end up having
now a network of people to maybe bounce ideas off
of, so it's kind of twofold
in terms of learning different roles, but also creating that
space and that network, especially in this virtual
world that we live in, that's a good way to do it. There are a lot of groups too
out in the LinkedIn space,
that our listeners can ask
to get involved with. There's also an app called Clubhouse. Have you ever heard of Clubhouse? - Yeah. - Yeah, so Clubhouse is a
really great option, as well.
And it's like a, I would describe it as a
real life, or a live podcast, non-video, but people just
getting together in a room to talk about a topic. - Are there a lot of Clubhouse groups
for data analytics and stuff like that? I'm not familiar. - Yeah, you would be surprised. Lots around data analytics, data governance, there's
a lot, data management.
Yeah, I would have people check it out. - Finally, before we let you go, is there anything else you'd
like our listeners to know that we didn't get a
chance to cover today? - No.
I would say just keep pursuing your dreams and just for them to remember that success is in the eye of the beholder and success means something
different to all of us and as long as we're all happy, for me,
I tell many people that
that's success right there is just to be happy and that's it. - I think that's excellent advice. Well Angela, thank you so
much for joining us today. - Thank you. Thank you
so much for having me.
This was amazing. - And to all of our listeners, if you haven't already, check
out our previous podcasts. They're available wherever
you listen to podcasts. And for more information
about starting your career
as a data scientist, go to
dataanalytics.buffalostate.edu and don't forget to subscribe
so that you get notified each time we release a new episode of Buffalo State Data Talk.