A New Era for Data Analysis in Qualitative Research: ChatGPT!

Today, the use of software in qualitative research analysis is rapidly becoming widespread among researchers. Researchers manage large data sets using features such as editing data, transcribing, creating codes, and searching within data. However, while software handles the formal side of data analysis, the analysis of the essence of the data is still done by researchers. ChatGPT, an AI language model released by OpenAI, has features such as editing, generating, and summarizing text. Considering these characteristics, this research asks whether ChatGPT can be used in data analysis for qualitative research. A case study design, one of the qualitative research methods, was adopted. The data consist of the interview texts of two participants from an unpublished study. The texts were put through a qualitative analysis process using ChatGPT-4. Data analysis was done in two separate ways: with and without specifying the codes, categories and themes. In conclusion, it was found that ChatGPT can create codes, categories and themes, quote directly from the text, interpret data sets, and analyze the meaning at the core of the data sets. In this context, the usability of ChatGPT for data analysis in qualitative research is discussed.


Introduction
Qualitative research, which has gained emphasis over the last quarter century, is a method that seeks to determine what a phenomenon means rather than only establishing cause and effect or describing distributions. It rests not only on theoretical knowledge but also on a philosophical background. Thus, the traditions and approaches leading to qualitative research vary. Data collection, data analysis, the researcher's epistemological and ontological attitudes, the nature of the research question, and many hidden and open influences all affect one another (Clarke et al., 2021). Therefore, in qualitative research, results focus on individuals, not numbers.
Qualitative research can produce different results when the same research is done at different times and in different environments. The common element of this method, which is quite flexible and can vary depending on the situation, is the path each researcher follows during analysis. After acquiring data for qualitative analysis, the researcher must first organize the data, categorize and code them for processing, and develop a system for presenting the data (Clarke et al., 2021). To make this easier, researchers have turned to Computer Assisted Qualitative Data Analysis Software (CAQDAS) as technology has advanced.
Although software created to assist in the analysis of text and content in the human sciences has been in use since 1960, it was mostly used for statistical purposes until 1980 (Cypress, 2019). The use of CAQDAS in qualitative research has been a topic of discussion between its defenders and critics since the day it emerged (Jenskin et al., 2023). Today, advances in technology and the growing importance of time for individuals in the technology era have made it possible to use software for purposes other than statistics.
In qualitative research, data can of course be analyzed manually, but as a result of these changes and developments, many qualitative researchers today use various software for qualitative data analysis (Cypress, 2019).
Overall, CAQDAS covers supporting, complementary programs and software that enable data management and analysis to be objective in qualitative data analysis (Vıgnato et al., 2022). Over the last 20 years, its active use in qualitative and mixed-method research has grown (Clarke et al., 2021). CAQDAS includes many software packages, such as NVivo, ATLAS.ti, and MAXQDA, which offer many advantages for data organization and coding efficiency in qualitative and hybrid studies. The increased auditability of the data and the ability for multiple researchers to code have many positive effects on the rigor and reliability of the findings in qualitative research (Clarke et al., 2021). Since CAQDAS allows large and complex data to be organized and simplifies analytical processes such as word and text search, researchers can spend more time on understanding and on specific analysis of the data (Jenskin et al., 2023).
As the CAQDAS packages have evolved rapidly over the last decade, users, especially new ones, face a number of problems with CAQDAS. It can be difficult for users to choose which software to use based on their needs, when to learn it, and what it costs. Also, qualitative researchers who are not proficient with computers may avoid using CAQDAS. Even for researchers who are proficient with computers, learning the software can take time, which may discourage the use of CAQDAS (Niedbalski & Ślęzak, 2022). While CAQDAS provides researchers with benefits in transcription, editing, classification, etc., it is also known that it cannot find meaning, interpret data, or analyze the content of data (Kalpokas & Radivojevic, 2022). This deficiency in CAQDAS may be addressed by Artificial Intelligence (AI) technologies that have evolved in recent years.
ChatGPT is a language model developed by OpenAI, an AI research and deployment company based in San Francisco, California. Based on GPT-3, which the company launched in 2020, ChatGPT is set up to engage in dialog with the user (Stokel-Walker, 2022). ChatGPT is a prototype dialog-based artificial intelligence chatbot that can understand natural human language and produce highly detailed, human-like text (Lock, 2022). ChatGPT's dialog format allows it to answer follow-up questions, admit its mistakes, challenge false assumptions, and reject inappropriate requests (OpenAI, 2022). ChatGPT described itself as follows: "ChatGPT is a language model developed by OpenAI. Language models are artificial intelligence systems that try to understand how language works and are used in applications such as language processing, text prediction, and language translation. ChatGPT is designed for a more natural language use of a chatbot and has the ability to chat with people. ChatGPT works as a language model and uses language examples while learning. In this way, ChatGPT can learn the structure and characteristics of the language that people use and use natural language to provide answers." (ChatGPT, 2022).
As of November 2022, when it was released under the name ChatGPT-3.5, it was followed with interest by more than 1 million people. ChatGPT is used in two versions, GPT-3.5 and GPT-4. For now, to use ChatGPT it is enough to go to the OpenAI site, log in with an email address or Google account, and verify a phone number. Using the ChatGPT-4 version requires an additional payment.
There are significant differences between GPT-3.5 and GPT-4. GPT-4 uses more data than previous models and is more reliable. GPT-4 gives more accurate results than GPT-3.5 in complex operations. In addition, GPT-4 performs better in English and in other languages than other language models. Importantly, however, it is still not fully reliable (it "hallucinates" facts and makes reasoning errors). GPT-4 is more successful in rejecting harmful requests (bomb making, harmful chemical components, etc.) (OpenAI, 2023a). Another important advantage of GPT-4 is that it has Plugins (add-ons). Through Plugins, GPT-4 can access the internet, reach up-to-date information, and view documents such as PDFs (OpenAI, 2023b).
Millions of users have used ChatGPT to write poetry, converse with it, or answer simple questions (Sier, 2022). Most users of the app have been greatly impressed by ChatGPT's performance and potential (Haque et al., 2022). ChatGPT provides answers, solutions, and explanations to complex questions, including potential methods for solving maths problems, writing code, gaming, homework essays and more. ChatGPT can be used to find errors in code easily; the AI not only corrects the error but also explains to users where they went wrong and what they can do to correct it (Naidu, 2022). ChatGPT can also suggest what an illness might be when symptoms are described, simplify complex topics for children, and write stories based on what the user describes (Özcan, 2022). As well as being one of the most powerful chatbots ever made, ChatGPT can shorten or elaborate on text provided to it (Kirmani, 2023). ChatGPT has been free to use as of December 2022. For now, to use ChatGPT, users simply go to the OpenAI site, log in with an email address or Google account, and enter the code sent to their phone number.
Although ChatGPT was released as a language model, its features such as shortening, interpreting and elaborating on text suggest that it could be used as CAQDAS. Taking this analytical ability into account, this research asks whether ChatGPT can be used in data analysis for qualitative research. The study investigated how ChatGPT is used in qualitative data analysis and how it performs qualitative data analysis. For this purpose, answers to the following questions were sought:
• How is ChatGPT used in qualitative data analysis?
• What qualitative data analyses has ChatGPT performed?

Research Design
This research uses a case study design, one of the qualitative research methods. The case study is a qualitative approach in which the researcher gathers detailed and in-depth information about real life, a current situation, or circumstances over a specific period of time, depicting a situation or creating themes (Creswell, 2013). In this research, the case study design was preferred because the research examined how the ChatGPT-4 model (CG) performed qualitative data analysis.

Study Group
In this study, the interview documents of an unpublished study on the university adaptation process were used to evaluate the analysis process of CG. Since the language of the interviews is Turkish, the texts were uploaded to CG in Turkish and analyzed in Turkish. A detailed description of the interview sets used is given in Table 1.

Data Collection Tools
In this study, both the data obtained from CG and the data of the study group were in the form of written documents, so documents were used as the data collection tool. Documents are among the data collection tools that researchers frequently prefer, as they contain detailed information, give access to large amounts of data, provide information about unobservable situations, and are cost effective (Çelebi & Orman, 2021).

Data Collection Process
Interviews for the research on the university adaptation process were transcribed. The transcripts were uploaded to CG in Turkish and certain commands were given to CG. The ChatGPT-4 model was used in the data analysis process because it responds to longer commands in a more detailed, reliable and accurate way. To prevent data loss, the analyses CG produced in response to the researchers' commands were copied from its interface and stored in digital media. The steps taken in the data collection process are given below:
3rd Step: In this step, CG was given theoretical information, categories were specified and the analysis was made within the boundaries drawn. A direct quote was requested for the created codes.
4th Step, Beyza's 1st Analysis: At this step, no limitation was placed on CG. CG was asked to create codes, categories and themes from the text. A direct quote was requested for the created codes.
5th Step, Beyza's 2nd Analysis: In this step, CG was given theoretical information, categories were specified and the analysis was made within the boundaries drawn. A direct quote was requested for the created codes.
6th Step, Joint Analysis of Beyza and Ali: In this step, CG was asked to show the similar and different aspects of the analyses it had previously done.

Validity and Reliability
The measures taken to increase the internal validity of the research are explained in detail in the data collection section. Data from documents 1 and 2 and from CG are quoted directly, without changes, in the findings section. To increase external validity, the research design, study group, data collection tools, implementation process, analysis of the data obtained, organization of the findings and the role of the researchers are explained in detail in the relevant sections. To increase the internal reliability of the research, that is, its consistency, the data obtained from documents 1 and 2 were analyzed in two different ways and the codes, categories and themes created were checked. To increase the external reliability of the research, the research data were properly discussed in the results section. Whether the results and findings sections are consistent was discussed and agreed among the researchers. A qualitative research specialist was consulted to confirm that the findings were consistent with the discussion section.

Findings
Analyze Like Chatting!
To analyze the data sets used in the research, we copied and pasted the data into the chat box of CG's interface. We placed the question we wanted to ask before the data, which we put in quotation marks, and sent them together. We did not start a new conversation for follow-up questions; after the first question, we asked the other questions in the same conversation. To analyze other data sets, we used "New Chat". In addition, our earlier chats are stored on the left side of the interface. CG's analysis of our data took between 1 and 20 seconds.
Figure 1
In the first step of our research, we explained to CG the actions to be taken. For this, we stated in our command what subject we were researching, who was speaking, and what we wanted from CG. We added the speech, in quotation marks, under the command we wrote. While the speech turns appear one under the other in the interview transcript, they run on consecutively on the screen when the text is added to CG. However, this does not adversely affect CG's analysis process. After defining the purpose of the research, we asked CG to divide the text into themes, categories and codes for our qualitative research. We copied the conversation text from the Word file and pasted it into CG's interface.
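The way a command and its quoted data were combined into one message can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study, where the text was pasted by hand into the chat box. The function name `build_prompt`, its parameters, and the sample transcript are all hypothetical.

```python
def build_prompt(topic: str, speakers: str, task: str, transcript: str) -> str:
    """Assemble one chat message: the command first, then the quoted transcript."""
    # Collapse line breaks: pasted speech turns run on consecutively anyway.
    flat_transcript = " ".join(transcript.split())
    instructions = (
        f"Research topic: {topic}\n"
        f"Speakers: {speakers}\n"
        f"Task: {task}"
    )
    # The data follows the command and is wrapped in quotation marks.
    return f'{instructions}\n\n"{flat_transcript}"'


# Hypothetical two-turn transcript, used only to show the shape of the message.
prompt = build_prompt(
    topic="university adaptation process",
    speakers="a researcher and one participant",
    task="Divide the text into themes, categories and codes for qualitative research.",
    transcript="Researcher: How were your first weeks?\nParticipant: I felt lost at first.",
)
print(prompt)
```

Keeping the command and the data in one message mirrors the order described above: what is being researched, who is speaking, what is wanted, and then the quoted speech.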

ChatGPT's First Analysis
After the first command is written, CG's analysis of the text can take between 10 and 20 seconds. CG may interrupt its response when analyzing after long commands. When this happens, the analysis can be continued with the "Continue generating" option. In addition, if a response is not satisfactory, CG responds again through the "Regenerate response" option. The analysis made by CG after the first command is as follows:

Figure 3 First Analysis Results
When Figure 3 is examined, it is seen that CG first explained the actions it took. It then created four themes, created codes under the themes, established a relationship between codes and themes, and created categories. It also added a note explaining how themes, codes and categories are used in qualitative research.
In this section, the text was analyzed, but CG did not indicate which statements of the participant each code was derived from.

Detailed Explanations, Requesting Direct Quote
In its first analysis, CG only listed the names of the themes, codes and categories, so we asked CG for a detailed explanation. We asked which statements of the participant the codes under each theme were derived from. For this, we used the command in Figure 4.
When we asked CG to explain which code it obtained from which statements of the participant, we found that CG's analysis was quite accurate. CG creates codes by extracting meaning from the expressions in the text during the analysis process. The direct quotes selected for the generated codes appear to be highly accurate. It was also determined that CG can produce different codes from the same paragraph. CG also drew on multiple expressions in the text when creating a code and explained why it created the code. When we repeated this process for all other codes, we found that CG largely transferred direct quotes from the text. Figures 5, 6, 7 and 8 contain the quotations and codes produced by CG.
At this stage, we found that CG made three different mistakes when quoting directly. In the first error, CG used two direct quotes for the code "Increasing In-Class Participation" under Theme 2 (see the highlighted area in Figure 6). The first of these quotes belongs to the participant, but the second is the sentence the researcher repeated to confirm what the participant had said. It seems that CG could not distinguish between the researcher and the participant here.
We saw the second error when we gave CG consecutive commands and came to Theme 4. CG gave three different direct quotes under the code "Change in Perception of University". However, we noticed that the second of these quotes was not sufficiently detailed (see the highlighted area in Figure 8).
The third error appeared when we came to the code "Increasing Participation in University Events" under Theme 4. We gave the command "From which statements of the participant did you get the code "Increased Participation in University Activities"?", but realized that CG used excerpts that were not among the statements in the text. We repeated the command, asking for different quotes. When the command was repeated, CG corrected the error by directly including the participant's own statements. When we gave the command for the code "Shift in Perspective Towards the Department" under Theme 4 (the 12th command after the text), we saw that it did not quote the participant directly. The dialogue in Figure 9 then developed:

Repeating the Analysis
After completing the first step of the analysis with CG, we opened a new chat and analyzed the same text again. CG reached different results when it analyzed the same text. The differences between the results were mostly in how they were expressed. The codes, categories and themes produced in the second analysis are shown in Figure 10.
As in the first analysis, we asked CG which expressions of the participant the generated codes were derived from. We asked for a direct quote separately for each code. For this, we used the command "From which expressions did you get the code "..." under Category ...?". As in the first analysis, we checked CG's examples against the text and saw that CG quoted directly. At this stage, we encountered errors similar to those in the first analysis. When we came to the direct quote command for the last code (the 12th command after the first one), CG gave statements that are not in the transcript as examples. After we pointed out the error, we sent the transcript again, and CG then corrected its mistake by quoting the text directly. While CG created 4 different themes and 12 codes in its first analysis, in the second analysis it used the word "category" instead of "theme" and created 11 codes under categories. In its second analysis, the results differed in the form of presentation. However, it clearly stated the themes to which the categories belong and the categories to which the codes belong. CG's themes and the codes under them in the first analysis, and the categories and codes in the second analysis, are shown in Figure 11.

Figure 11 Comparing 1st and 2nd Analysis
When Figure 11 is examined, it is seen that what was named "theme" in the first analysis and what was named "category" in the second analysis are similar. According to Figure 11, the code "Shift in Perspective Towards the Department" from Analysis 1 appears in both analyses. However, the category under which it is placed in Analysis 1 differs from the category under which it is placed in Analysis 2.
Although the codes created in the first and second analyses differ in name, they are very close in the meaning they carry and in the direct quotes from which they were derived. Figure 12 exemplifies this situation:

Specifying Categories
At this stage, we re-analyzed the text by opening a "New Chat". This time, before uploading the interview text, we gave CG a summary paragraph of theoretical information about the university adaptation process. In this text we shared brief information with CG without going into the details of the theory. The theoretical information provided is shown in Figure 13. The analysis made by CG after this command is shown in Figure 14.

Figure 14 Analysis by Categories
As Figure 14 shows, when we specified the categories and asked CG to create codes, it created 17 meaningful codes that fit 6 categories. These codes differed in naming and presentation from the codes in the first and second analyses. When we examined the content of the codes, we found that it was similar to the other analyses. At this stage, we asked in turn for direct quotes from the participant's statements for each code. As in the first and second analyses, CG gave statements that were not in the interview text as examples when it came to the last code (the 13th command after the text). After we pointed out the error, we sent the interview text again; CG then provided a direct quote from the text, and the error was corrected. The passages that CG quoted directly for the codes paralleled the passages quoted in the first and second analyses.

Making a Consecutive Analysis
After analyzing the first participant with CG, we proceeded to the analysis of the other participant in the research. For this, we first continued in the chat where we had made the first analysis. We had given CG 18 consecutive commands for the first participant's analysis. As the 19th command in this chat, we wrote the command in Figure 15.

Figure 15 Analysis of the Second Participant
CG created different codes, categories and themes after analyzing the data of the second participant. As with the first participant, we performed the same actions for the second participant, such as asking for direct quotes and analyzing by categories. Then we asked CG to specify the codes common to the first and second participants. At this stage, CG stated that it did not have information about the first and second participants' codes.

Figure 16 Error Message
After the error message, we opened a "New Chat" and uploaded the category-based analysis results of both participants. Then we gave CG the command in Figure 17:

Figure 17 Binary Comparison Command
Following this command, CG gave its results in tabular form. When Figure 18 is examined, it is seen that CG can identify common codes, categories and themes in previous analyses, reveal similarities and differences between participants, and explain these similarities and differences.

Discussion
Computer-assisted qualitative data analysis software (CAQDAS or QDAS) has been available since 1980. Nowadays, using software for qualitative research is becoming increasingly common among researchers (Cypress, 2019). John and Johnson (2000) discussed the benefits of using software in qualitative data analysis, such as freeing researchers from manual editing tasks, saving time, handling large qualitative data sets, and increasing the validity and control of qualitative research. Widely used qualitative research software has features such as coding, consolidation, data search and query, data visualization, transcription, and statistical analysis by the researcher (Clarke et al., 2021). Software such as ATLAS.ti, NVivo and MAXQDA have added automatic analysis tools to their current versions. Using word frequencies, word clouds, and autocoding capabilities, researchers can get a bird's-eye view of data content before proceeding to deeper analysis (Paulus, 2022). However, these programs are not sufficient for the search for meaning, which is the essence of qualitative research. Though existing software can find the words in a qualitative data set, the relationships between the words, and the frequency of the words used, it lacks the analytical capability to interpret what the data contains (Kalpokas & Radivojevic, 2022; Vıgnato et al., 2022).
CG's ease of use distinguishes it from other CAQDAS. Many CAQDAS packages must be learned before they can be used effectively. To use CG, however, it is enough to explain the action to be taken in a "New Chat". It was found to be sufficient to tell CG the purpose of the research and to indicate the action to be taken for data analysis. In addition to ease of use, formal deficiencies in the interview texts uploaded to CG for data analysis (speech turns not being laid out one under the other, spelling errors, etc.) do not interfere with data analysis. Text, tables and outputs can be copied completely from CG's interface without being corrupted.
CG's ability to operate in different languages allows researchers to analyze their data sets in their native language. In our research, CG successfully completed the analysis process without semantic errors, although all of the interactions with CG were in Turkish. In this context, it can be said that CG can analyze in a language other than English.
As seen in our research, CG can analyze the data uploaded to it in seconds (this may vary depending on text length). For the data we used in our research, CG's analysis took between 1 and 20 seconds. This is much faster than the time a researcher normally needs for qualitative data analysis. However, CG can interrupt its response when the response is long. In such cases, the response can be resumed from where it left off with the "Continue generating" option. When a response is not satisfactory, it can be repeated with the "Regenerate response" option. With this feature, CG offers the researcher different response options. CG can also show lapses in understanding commands and responding correctly. However, in such cases, repeating the commands and explaining them clearly can solve the problem.
When we asked CG to create codes, categories and themes without giving any information, it was determined that CG could create meaningful codes, categories and themes. Existing qualitative research analysis software can create codes from the text given to it, but cannot explain why it creates the codes (Kalpokas & Radivojevic, 2022). CG, by contrast, explained why it created its codes and why it placed them under particular categories and themes. It was also found that when we provided theoretical information to CG and specified the categories, it collected the appropriate codes under the appropriate categories and themes.
In this context, it can be said that researchers can use CG to analyze according to categories and themes they have determined in advance.
Across different analyses, it was observed that CG can use the concepts of category and theme interchangeably, but codes are always placed under categories and themes. The codes created are consistent with the relevant categories and themes. CG was shown to give different results in repeated analyses of the same data. These differences were generally not differences in the essence of the codes, categories and themes. In the second analysis of the same data, the naming and the number of codes created differed. Although the naming was different, direct quotes for two similar codes were selected from the same statements. In this context, it can be concluded that CG names the code for the same paragraph differently but treats the meaning as the same. It was also determined that some codes can fall under different categories. These differences are comparable to the differences that may arise when researchers analyze the same data set at different times.
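The observation that differently named codes rest on the same statements suggests a simple check a researcher could run on two analyses. The sketch below pairs codes across analyses by how many supporting quotes they share, using the Jaccard overlap of their quote sets. It is an illustrative assumption, not a procedure used in this study; the function name, the data layout, and the placeholder code and quote labels are hypothetical.

```python
def match_codes_by_quotes(
    first: dict[str, set[str]], second: dict[str, set[str]]
) -> list[tuple[str, str, float]]:
    """Pair codes from two analyses by the overlap of their supporting quotes."""
    matches = []
    for name_a, quotes_a in first.items():
        for name_b, quotes_b in second.items():
            union = quotes_a | quotes_b
            if not union:
                continue  # neither code has any quotes to compare
            # Jaccard overlap: shared quotes divided by all quotes of both codes.
            overlap = len(quotes_a & quotes_b) / len(union)
            if overlap > 0:
                matches.append((name_a, name_b, overlap))
    # Strongest matches first.
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Placeholder data: code names and quotes are hypothetical labels.
analysis_1 = {"Code A": {"quote 1", "quote 2"}, "Code B": {"quote 3"}}
analysis_2 = {"Code A (renamed)": {"quote 1", "quote 2"}, "Code C": {"quote 4"}}

for name_a, name_b, overlap in match_codes_by_quotes(analysis_1, analysis_2):
    print(f"{name_a} ~ {name_b}: overlap {overlap:.2f}")
```

Matching on quotes rather than names is a deliberate choice here, since the naming is exactly what varied between the repeated analyses.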
When direct quotations were requested for the codes CG created, it was found that CG was able to present the quotations to the researcher correctly. Although CG was successful in providing direct quotes, it did not always work correctly. When given many commands in a row, CG was found to make mistakes. In such cases, the researcher must point out the error to CG, and the process must be repeated if necessary. Although the margin of error here is small, researchers are recommended to check direct quotes. In addition, CG appears able to improve its analysis by asking for further explanation when it encounters problems while analyzing.
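The recommended check of direct quotes can be partly automated. The sketch below assumes a transcript already split into (speaker, utterance) pairs and classifies each quote returned by the model as coming from the participant, from another speaker, or from nowhere in the text. These cases correspond to the quotation errors reported in the findings. The names `normalize` and `check_quote`, the speaker labels, and the sample turns are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so copy-paste differences do not matter."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def check_quote(
    quote: str, turns: list[tuple[str, str]], participant: str = "Participant"
) -> str:
    """Return 'participant', 'other speaker', or 'not found' for one quote."""
    needle = normalize(quote)
    if not needle:
        return "not found"
    found_elsewhere = False
    for speaker, utterance in turns:
        if needle in normalize(utterance):
            if speaker == participant:
                return "participant"
            found_elsewhere = True  # verbatim, but said by someone else
    return "other speaker" if found_elsewhere else "not found"


# Hypothetical transcript: the researcher repeats the participant's sentence.
turns = [
    ("Researcher", "So you joined more events?"),
    ("Participant", "Yes, I joined   more events after the first term."),
    ("Researcher", "You joined more events after the first term."),
]

for quote in [
    "I joined more events after the first term",
    "You joined more events after the first term.",
    "I attended every club meeting",
]:
    print(f"{check_quote(quote, turns):14} | {quote}")
```

A verbatim substring match is strict by design: a paraphrased or shortened quote is reported as "not found" and still needs a manual look.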
CG was found to make mistakes when many consecutive actions were taken in the same chat. Switching to the analysis of the second participant directly after analyzing the first resulted in inefficient use of CG. In this context, it is more appropriate to analyze each participant in a different "Chat" during the data analysis process.
When the analyses of the two participants were compared, it was determined that CG can correctly identify similarities and differences and explain its analysis in a meaningful way. In addition, when the analyses of more than one participant were to be compared, the type of analysis in which the categories were specified gave more functional results.
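One reason category-specified analyses are easier to compare is that both participants' codes are already grouped under the same headings. As a rough sketch under that assumption, the comparison reduces to set operations per category. The function name, category labels, and code labels below are hypothetical placeholders, and the sketch only detects codes with identical names.

```python
def compare_participants(
    first: dict[str, set[str]], second: dict[str, set[str]]
) -> dict[str, dict[str, set[str]]]:
    """For each category, list codes shared by both participants and unique to each."""
    result = {}
    for category in sorted(set(first) | set(second)):
        codes_a = first.get(category, set())
        codes_b = second.get(category, set())
        result[category] = {
            "common": codes_a & codes_b,
            "only_first": codes_a - codes_b,
            "only_second": codes_b - codes_a,
        }
    return result


# Placeholder analyses for the two participants (labels are hypothetical).
ali = {"Category 1": {"Code X", "Code Y"}, "Category 2": {"Code Z"}}
beyza = {"Category 1": {"Code X"}, "Category 3": {"Code W"}}

comparison = compare_participants(ali, beyza)
for category, groups in comparison.items():
    print(category, groups)
```

Codes that carry the same meaning under different names would still need to be reconciled by the researcher, or by CG itself, before a comparison like this is meaningful.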

Conclusion
The use of CAQDAS for qualitative research analysis has advantages and disadvantages in various respects. While researchers using CAQDAS can take advantage of features such as editing, transferring, and transcribing data, the software falls short in analyzing data, interpreting it, and understanding its essence. CG and the AI-based software likely to be developed in the future offer new advantages, disadvantages and opportunities for qualitative research analysis. In this research, analyses were conducted with limited written data to reveal the potential of CG, an AI-based software. As a result, it was observed that CG could reveal meaning in qualitative data analysis and understand and interpret the essence of the data. CG was able to analyze the interview text, extract codes, categories and themes, and include direct quotes from the text, in accordance with the purpose of the research. It also explained the reasons for its actions. It analyzed the data from different participants and was able to identify similarities and differences among these analyses.
As a result, it can be stated that CG can be used as auxiliary software for researchers in qualitative data analysis, in uncovering meaning, and in creating codes, categories and themes.

Suggestions
The data sets used in this research are limited to the data of two participants. For CG to be used in qualitative research analysis, it should be tested in studies with different designs and content. Since the native language of the researchers in this article is Turkish, all interactions with CG were in Turkish.
Researchers with other native languages may be advised to use CG in their own language. Because of the errors seen after long sequences of commands, it may be suggested that consecutive analyses not be performed in the same "Chat" and that each participant be analyzed in a separate chat. Since the commands given to CG in this research are limited, future studies can explore which commands should be written in different kinds of studies. Researchers may be advised to use the ChatGPT-4 version because of its technical superiority. Various studies can be conducted on the validity and reliability of CG's analyses. Since the use of CG for data analysis in qualitative research may raise ethical and philosophical problems, the ethical and philosophical aspects of this issue should also be discussed.

Figure 4 Command for Requesting Direct Quote

Figure 5 Theme 1

Figure 9 Indicating the Error
After the dialogue above, we gave the entire interview text to CG again. CG then corrected its error, using the participant's own statements to derive the code.

Figure 13 Theoretical Information
After this command, CG responded with statements that summarized the paragraph we shared. Then we gave CG the command to analyze by specific categories.

Table 2 Steps Applied
1st Step: At this step, no limitation was placed on CG. CG was asked to create codes, categories and themes from the text. A direct quote was requested for the created codes.
2nd Step: At this step, no limitation was placed on CG. CG was asked to create codes, categories and themes from the text. A direct quote was requested for the created codes. It was done to confirm the first step.