An Investigation of Search and Information Retrieval Problems in Databases from the Perspective of the Writing Features of the Persian Language

Authors: Homavandi, Hoda; Norouzi, Yaghoub; Hosseini Beheshti, Molouk Sadat

Information Processing and Management (پردازش و مدیریت اطلاعات), Spring 1397 (2018), No. 91, Rank A (Ministry of Science/ISC), 24 pages, pp. 1087 to 1110

Keywords: Persian language; information retrieval; databases; writing features

Abstract:

This study set out to describe the main writing and semantic problems of the Persian language in information environments and to determine how far Persian databases accommodate and attend to these features during search and retrieval. It used a survey-analytical method based on direct observation. After a review of related studies, the search keywords were compiled into a checklist. Each keyword was searched in the databases under study, which were the Iranian Research Institute for Information Science and Technology, the Islamic World Science Citation Center, the Noor Specialized Journals Database, and the Scientific Information Database of Jahad Daneshgahi, and the number of retrieved results was recorded. The degree to which the databases accommodate these features was then examined. Some writing and semantic features of Persian cause problems in retrieving information from the selected databases. On the writing side, these include joined versus separated writing of derivative, compound, and derivative-compound words, the variety of plural forms, and loanwords and their equivalents; on the semantic side, they include polysemy, homonymy, and the like. In the databases studied, the lack of adequate coverage of these features at the storage and processing stages, together with the failure to inform users of them so that they can adjust their search at the retrieval stage, adversely affects search and retrieval. The findings showed that Persian databases pay insufficient attention to the writing and semantic features of Persian and ignore many of them at the storage and processing stages. Given the effect of these features on users' interaction with databases, the need of Persian-speaking users for native search tools and for databases designed around the features of their own language is felt more than ever. By examining how well Persian-language databases cover some features of the language that significantly affect search and retrieval, the study identifies the strengths and weaknesses of these databases. Its results can be used to improve and correct their performance.
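The joined-versus-separated writing problem can be illustrated with a short sketch. The Persian word for "databases" is commonly written three ways: fully joined, with a half-space (zero-width non-joiner, U+200C), or with an ordinary space. The function `spacing_key` below is a hypothetical helper, not something described in the study; it assumes that a database could index and query on a key that ignores these spacing differences and maps Arabic yeh and kaf to their Persian forms.

```python
import re
import unicodedata

ZWNJ = "\u200c"  # zero-width non-joiner, the Persian "half-space"

# Arabic code points that are often typed in place of their Persian counterparts.
CHAR_MAP = str.maketrans({
    "\u064a": "\u06cc",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06a9",  # Arabic kaf -> Persian kaf
})


def spacing_key(term: str) -> str:
    """Return a matching key that ignores spacing variants of a single term.

    Assumption: this is a crude illustration. It removes every space, so it
    is only suitable for one term at a time, not for a multi-word query.
    """
    term = unicodedata.normalize("NFC", term).translate(CHAR_MAP)
    # Drop half-spaces and ordinary whitespace so all spacing variants collapse.
    return re.sub(r"[\u200c\s]+", "", term)


stem, plural_suffix = "پایگاه", "های"
variants = [
    stem + plural_suffix,         # joined
    stem + ZWNJ + plural_suffix,  # half-space
    stem + " " + plural_suffix,   # full space
]
keys = {spacing_key(v) for v in variants}
print(len(keys))  # 1
```

A key of this kind would be applied both when records are stored and when a query is processed, so that the three spellings retrieve the same records. It does not address plural variety, loanword spellings, or any of the semantic problems.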

The present research aimed to explicate the major writing and semantic problems of the Persian language in information environments and to determine the degree to which Persian databases accommodate and attend to these features. It is a survey-analytical study conducted through direct observation. Having reviewed the related literature, we compiled a checklist of search keywords. Each keyword was searched in the databases under study, including the Iranian Research Institute for Information Science and Technology, the Regional Centre for Information Science and Technology, Noormags, and the Scientific Information Database affiliated with Jahad Daneshgahi, and the number of retrieved results was recorded. Some writing and semantic features of Persian cause problems in retrieving information from the selected databases. In writing, these features include joined and separated forms of derivative, compound, and derivative-compound words, the diversity of plural forms, and loanwords and their equivalents; in semantics, they include polysemy, homonymy, and so on. For instance, the databases retrieve different results for the written forms «فناوری» and «فن آوری» ("technology") of a derivative-compound word, for «پتاسیوم» and «پتاسیم» ("potassium") as variant spellings of one word, and for the keywords «دریای خزر», «دریای مازندران», and «دریای کاسپین» (three names for the Caspian Sea). The lack of appropriate coverage of such forms as synonymous words, and the failure to inform the user about them so that the search can be refined, negatively affect the search and retrieval process. Findings indicated that Persian databases do not pay adequate attention to the writing and semantic features of Persian and disregard many of them in searching and retrieving information.
Given the impact of these features on users' interaction with databases, Persian-speaking users' need for native search tools and for databases designed in accordance with the features of their own language has become more and more urgent. The present research examined the ability of Persian databases to cover some features of this language that noticeably affect searching and retrieval, and pinpointed the weaknesses and strengths of these databases. Its results could be used to improve the performance of the above-mentioned databases.
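The synonym and variant-spelling problem can be sketched in a similar way. The example below is a minimal illustration, not the method of any of the databases studied: `VARIANT_GROUPS`, `expand_query`, `search`, and the toy index are all hypothetical, and a real system would draw its groups from a curated thesaurus rather than a hard-coded list. It shows how a query for one name of the Caspian Sea could be expanded so that records indexed under the other two names are also retrieved.

```python
from typing import Dict, List, Set

# The three names and two spellings used as examples in the abstract.
KHAZAR = "دریای خزر"
MAZANDARAN = "دریای مازندران"
CASPIAN = "دریای کاسپین"
POTASSIUM_A = "پتاسیم"
POTASSIUM_B = "پتاسیوم"

# Hypothetical variant groups; assumed to come from a thesaurus in practice.
VARIANT_GROUPS: List[Set[str]] = [
    {KHAZAR, MAZANDARAN, CASPIAN},
    {POTASSIUM_A, POTASSIUM_B},
]


def expand_query(query: str) -> List[str]:
    """Return the query plus every known synonym or variant spelling."""
    q = " ".join(query.split())  # collapse repeated whitespace
    for group in VARIANT_GROUPS:
        if q in group:
            return sorted(group)
    return [q]


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Union the record ids found under each expanded form of the query."""
    hits: Set[int] = set()
    for form in expand_query(query):
        hits |= index.get(form, set())
    return hits


# Toy inverted index: keyword -> ids of records that contain it.
toy_index = {KHAZAR: {1, 2}, MAZANDARAN: {3}, CASPIAN: {2, 4}}

print(sorted(toy_index[KHAZAR]))          # [1, 2]
print(sorted(search(toy_index, KHAZAR)))  # [1, 2, 3, 4]
```

Without expansion, the query returns only the records indexed under the exact name typed, which is the behaviour the abstract reports. An interface could also show the user the expanded forms, which would address the point about informing users so they can refine their search.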
